Some of the historical developments which have
influenced educational evaluation are the importance of
operationalism and behavioral objectives; curriculum reform; the
Elementary and Secondary Education Act; and school and teacher
accountability. Major problems related to current evaluation
practices include negative attitudes toward evaluation, and the lack
of theory and adequate guidelines for evaluation. Traditional
conceptual schemes of evaluation include measurement as evaluation;
determination of congruence; professional judgment; and applied
research, though most evaluators and researchers implicitly support a
separation between these two modes of inquiry. Current approaches to
evaluation and evaluation models include Provus' discrepancy
evaluation model, Stake's antecedents-transactions-outcomes model,
and the context, input, process, product (CIPP) model. These
decision-oriented evaluation models can be contrasted with
value-oriented evaluation, based on Dewey's conceptualization of
valuation. Out of this theoretical base, naturalistic evaluation
methods have arisen, including the responsive, judicial,
transactional, connoisseurship, and illumination models.
Concurrently, systems-oriented evaluation has also been developed, to
interconnect the planning, development, and evaluation process. All
approaches have implications for needs assessment and policy.
(Over 30 bibliographical references are appended.) (GDC)
I. Overview of Evaluation

One purpose of educational evaluation is to provide decision makers with
information about the effectiveness of an educational program, product, or
procedure. Within this perspective, evaluation is viewed as a process in
which data are obtained, analyzed, and synthesized into relevant information
for decision making.

While most evaluation activities fit comfortably within the bounds of 
this definition, the specific approach used and procedures employed vary from 
one evaluation study to another as a function of who is doing the evaluation, 
the context in which the evaluation is to occur and the desires and needs of 
the individual or agency contracting the evaluation. While there is basic
agreement about the fundamental role of evaluation in education, beyond this
there is considerable variance in the conceptual frameworks used by practitioners.

Indeed, even the ways in which evaluation has been defined in the literature
have produced considerable debate.

Bloom, Hastings and Madaus (1971) point to five different facets of
evaluation, not all of which are included in other definitions. These authors
pose a broad view of evaluation consisting of the following activities:

1. Acquiring and processing the evidence needed to improve
the student's learning and the teaching.

*The author would like to thank Ron Jemelka for his many significant
contributions to this paper, especially those pertaining to the concept of
value-oriented evaluation which will appear in Jemelka, R. and G. Borich,
Traditional and Emerging Definitions of Educational Evaluation. Evaluation
Quarterly, in press.

2. Employing a great variety of evidence beyond the final
paper and pencil examination.

3. Clarifying the significant goals and objectives of education
and determining the extent to which the students are
developing in these desired ways.

4. Instituting a system of quality control in which it may be
determined at each step in the teaching-learning process
whether the process is effective or not and if not, what
changes must be made to insure effectiveness.

5. And, ascertaining whether alternative procedures are equally
effective or not in achieving a set of educational ends
(pp. 7-8).

As general as these activities may appear, they are not the only purposes
for which evaluations can be conducted. Stufflebeam et al. (1971), for example,
divide evaluation into a four part process consisting of context, input,
process and product evaluations, each with its own objectives and methods,
while Provus (1971), Stake (1967), Hammond (1973), Metfessel and Michael (1967)
as well as others conceptualize and partition the process, if not the domain,
of evaluation in still other ways.

With evaluators differing on such basic issues, it is not surprising that
one can find numerous evaluation paradigms or "models" in the literature to
help shape and guide evaluation activities. The problem for the evaluator
becomes one of choosing the conceptualization or model most appropriate to his
evaluation problem. Because the evaluation models appearing in the literature
are purposely general so as to be applicable to a wide variety of educational
problems, the task of choosing that conceptualization of evaluation most
appropriate to a specific purpose becomes even more arduous. One focus of
this paper will be to trace the origins of the problem of choosing the correct
conceptualization or model for an evaluation and to identify some of the
underlying factors which have contributed to the heterogeneity of opinion
concerning the definition, nature and scope of educational evaluation.

To this end I will present an overview of some historical developments
which have influenced the growth of educational evaluation. This chronology
will provide the foundation for an interpretation of contemporary movements
in the field and the extrapolation of these movements to the not-too-distant
future.

Before proceeding, a personal note is in order. I have struggled in this
writing to keep separate the idea of where the field of evaluation is going
from the idea of where this author believes it should be going. As most
authors will attest, any writing is inextricably tied to the author's
background, training and philosophy, and this chapter is no exception. As
Kuhn (1970) has made us painfully aware, "an apparently arbitrary element,
compounded of personal and historical accident, is always a formative
ingredient of the beliefs espoused by a given scientific community (and
scientist) at a given time. . . . among those legitimate possibilities, the
particular conclusions he does arrive at are probably determined by his prior
experience in other fields, by accidents of his investigation, and by his own
individual makeup" (p. 4). Kuhn's observation leads us to ask who might be
the wiser: the scientist who writes about his field influenced by his own
implicit biases and the philosophy of his scientific community or the
objective scholar who chronicles the accomplishments of a discipline of which
he has only fundamental knowledge? When our country chose the Swedish
sociologist Gunnar Myrdal to write an objective report on the status of the
American Negro, it clearly valued the view of an outsider. While it is
difficult to measure the consequences of either approach, history has shown
the value of each. Where the reader feels my own interpretation is only one
interpretation that may be made from these historical trends, he or she will
no doubt be correct.



II. Where We Are Now: History and Current Status of Evaluation

This section briefly reviews the history of educational evaluation, presents
the roles evaluation has traditionally played in education, and summarizes the
current status of the field.

Educational Developments and Societal Trends Influencing the Growth and Development
of Evaluation

In the first three decades of this century the measurement of human abilities
grew out of early work by Binet, Thorndike, and Thurstone. This newly developed
measurement technology had much appeal to educators and was assimilated into
educational practice, giving rise to the development of standardized achievement
tests which made possible large scale testing programs. The accreditation
movement also flourished during this early period and with the development of
formal accrediting policies for colleges and schools, program evaluation gained
a foothold in education. Later, the Educational Testing Service (ETS),
established in 1947, and a national system of research and development centers
and laboratories, established in 1966, provided additional momentum to the
field of evaluation through evaluation projects and contributions to evaluation
methodology. (See Borich, 1974, and Poynor, 1974, for a selection of evaluation
contributions from these centers and laboratories.)

Impact of Operationalism and the Behavioral Objectives Movement*

The concept of behavioral objectives has held a position of importance in
the field of evaluation for almost half a century. One origin of the concept of
behavioral objectives can be traced to a book by Bridgman (1927) titled The
Logic of Modern Physics. In his book Bridgman pointed to the need to define
new constructs by describing the operations used to measure them. Bridgman's
concept offered an alternative to the practice of defining constructs by their
apparent commonality or lack of commonality with other constructs which earlier
had been defined in the same manner. Through the efforts of Bridgman and
parallel efforts of others the idea of operationally defining constructs became
incorporated into the behavioral sciences, where constructs such as "motivation,"
"anxiety," and "learning" were redefined in terms of the measurement operations
used to observe them. Other frequently used constructs, such as the construct
"insight," took on mostly theoretical significance for lack of practical and
reliable means of measuring them. This process of tying construct definition to
construct measurement became an integral part of the school known as behaviorism
to which the behavioral objectives movement owes its beginning.

*I am indebted to Bloom, Hastings and Madaus (1971) for the early origins of
this movement.

The application of operationalism to education resulted in the outgrowth of
two distinct but related movements. The first is typified by Tyler's Eight
Year Study of secondary education for the Progressive Education Association
(Smith and Tyler, 1942) in which behavioral objectives were extensively used to
evaluate "progressive" attempts to apply new curricula and approaches to
instruction. Tyler's contribution is significant not only because it offered the
first example of how behavioral objectives could be used to construct evaluation
instruments and to appraise the effectiveness of curricula but also because
it provided the impetus for many developments in the field which were to follow.
Some of the more noteworthy of these were the Taxonomy of Educational Objectives
in the Cognitive Domain (Bloom et al., 1956) and Affective Domain (Krathwohl et
al., 1964) and a popular book by Mager (1962) on how to write educational
objectives. These volumes, in turn, stimulated an extensive literature on
behavioral objectives, both in support of and critical of their application in
the schools (Popham, 1969; Eisner, 1969).

A second movement rooted in a behavioristic philosophy was the programmed
instruction and related computer assisted instruction movement of the late 1950's
and 1960's. Behaviorally stated objectives were central to both these forms of
instruction. The development of programmed and computer assisted instruction
depended heavily on the specification and breaking down of content into discrete
learnable units having measurable outcomes, for which the concept of behavioral
objectives was ideally suited. In this behavioralistic setting, several large
development and evaluation projects were begun. Of particular note were evaluations
of the PLATO and TICCIT computer assisted instruction projects designed to study
the cost and effectiveness of computer based instruction for teaching large numbers
of geographically dispersed students. (See Alderman, 1978; Murphy and Appel, 1977;
and Orlansky and String, 1978, for evaluations of these and other computer based
instruction projects.)

The Impact of the Curriculum Reform Movement 

A major impetus to the development of evaluation was the curriculum reform
movement. Spanning roughly the decades of the 1950's and 1960's, the curriculum
reform movement was characterized by widespread change in the philosophy,
techniques and materials used in teaching elementary and secondary school
children. Most notable were the changes which occurred in the sciences shortly
after the 1957 launching of the Soviet satellite, Sputnik. Prior to this
unsettling event, curricula for the public schools were written primarily by
individuals, authoring textbooks which changed only slightly the style and
content of earlier versions. Due partly to the inability of any single author
to undertake major curriculum reform and partly to the liability to oneself and
publisher such reform might present if not saleable, curriculum changes were
slow and for the most part conservative. With Soviet competition in the
sciences, however, came the impetus for the federal government to play an
increasing role in the field of education, at first through the vehicle of the
National Science Foundation and later through the efforts of the U.S. Office of
Education and the National Institute of Education. The post-Sputnik era
provided the context for new initiatives in the design and development of
curricular materials, particularly in the fields of science and mathematics.
These initiatives represented not only an effort to reform certain segments of
the school curriculum but also to try new approaches to curriculum development
which placed decreasing emphasis on the individual author and increasing
emphasis on teams of specialists brought together by public monies specifically
for the purpose of infusing the school curriculum with the latest scientific
advances. New content and innovative ways of presenting it became more
palatable with the burden of risk for a development project being shared by
teams of specialists sponsored by government monies. Even more appealing was
the fact that often extensive discussions, symposia, and workshops would
accompany these development projects for the purpose of giving teachers and
scientists a significant role in the design and selection of content. This
unique integration of theory and practice became a key element in a process
which was to become characteristic of the curriculum reform movement.

Also of significance was the fact that with the systematic approach to
curriculum development the previously isolated concepts of development and
evaluation became parts of a unitary process. Because of the experimental
nature of much of the content and approaches used, pilot and field testing of
instructional components became logical extensions of the curriculum
development effort. It was in this context that projects such as the
Biological Sciences Curriculum Study (BSCS), the Chemical Education Materials
Study (Chem Study), the Physical Science Study Committee (PSSC) and School
Mathematics Study Group (SMSG) were born. These projects contributed
significantly to the field of evaluation by employing development strategies
which required the repeated testing and revision of component parts of the
curriculum. This process of testing well-defined units of a curriculum during
development for purposes of revision and modification was later to be coined
"formative evaluation" by Scriven (1967). (See
Grobman, 196?, for a review of the curriculum reform movement and a history of
the Biological Sciences Curriculum Study.)

The significant role which evaluation played in these projects stimulated
efforts at several universities to mount doctoral training programs in the
area of evaluation. Training programs were begun at the Ohio State University,
influenced principally by Professor Stufflebeam (now at Western Michigan
University), the University of Illinois, influenced principally by Professor
Stake, and at the University of Virginia, influenced principally by the late
Professor Provus. In addition each of these individuals developed in
conjunction with his training curriculum an evaluation model which could be
used in evaluating educational programs and curricula. These models would
later figure centrally in the development of the field of evaluation.

The Impact of ESEA

Despite the influence of the behavioral objectives and curriculum reform
movements, there was still relatively little emphasis placed on the evaluation
of educational programs by the mid 1960's. It was within this context that
the U.S. Congress began debate on the Elementary and Secondary Education Act
of 1965 (ESEA). This comprehensive and ambitious educational legislation was
to make available large sums of money in the form of grants to universities and
local education agencies for educational materials, development and research.
As the bill was debated, concern was expressed that there were no assurances
that the federal monies made available would actually result in improvements
in the quality of education. This concern was perhaps magnified by the general
belief that, in the past, educators had done a poor job of accounting for the
federal money they spent.

Motivated by this concern, the Congress insisted on a provision to ESEA
requiring that evaluation reports be submitted by grantees reporting the
impact of their programs. These guidelines were conveyed to
prospective grantees in an ESEA Title III manual published by the U.S. Office
of Education, requiring the applicant to:

A. Where applicable, describe the methods, techniques and
objectives which will be used to determine the degree
to which the objectives of the proposed program are
achieved.

B. Describe the instruments to be used to conduct the
evaluation, and

C. Provide a separate estimate of costs for evaluation
purposes. (p. 48)

Although the final version of the bill did not require evaluation of all
the programs (titles) under ESEA, there was a clear mandate from those providing
federal funds for education that programs utilizing these funds be accountable
for the educational programs, products, and procedures they developed and/or
implemented. For the first time educators were required to devote time and
resources to evaluating their own efforts.

This emphasis on accountability became evident again in 1971, when a rider
was placed on legislation requiring that all ESEA projects be evaluated by the
grantee. The current popularity of "sunset" and "sunshine" policies and
zero-based budgeting among both state and federal funding agencies reflects this
continued emphasis on accountability. These policies require the recipients
of funds to justify refunding of their program each year or program cycle and
to make program decisions and expenditures a matter of public record.

Impact of School and Teacher Accountability

The concept of school and teacher accountability emerged as an outgrowth of the
ESEA legislation of 1965 and 1971. Federal agencies and grantees responsible for
innovative ESEA programs were only the first to feel the pressure for
accountability. Because many of these programs dealt directly with the schools,
the accountability demanded of them also raised questions about the school staff
who played a prominent role in their implementation. Consequently, teaching
effectiveness and the administrative accountability of schools in general often
became the focus of attempts to monitor and evaluate federally funded programs.
The concepts of "accountability," "cost-benefit," and "quality assurance"
filtered down in spirit, if not in substance, to the local school and teacher.

By 1970 community pressures began to bear down on the local school, often
demanding accountability in terms of pupil outcome. In some cases school
administrators responded to these pressures by concentrating on the more obvious
indicators of effectiveness, such as pupil performance on national achievement
tests, number of college admissions, and National Merit scholarships. Others
began exploring ways to make cost-effective decisions about the operation and
management of their school in order to prove that increased revenues actually
produced more effective teaching and learning. School administrators embraced
accountability procedures in answer to community pressures for more objectively
determined and effective ways to spend school revenue and to make internal
decisions that could be defended to school boards, PTA's and professional groups.

It was within this context of widespread community concern about higher but
apparently unproductive school expenditures that some state governments began
discussing legislation requiring the appraisal of school-district personnel.
A prime example of state-enacted accountability legislation was California's
Stull Act passed in 1971, requiring that school boards in that state evaluate
their educators yearly and provide recommendations for their professional
development. The Stull Act gave local communities a mandate to develop
procedures for appraising school district personnel and for periodically
reporting appraisal data back to the teacher in order to upgrade his or her
performance. A major impact of the school and teacher accountability movement on
the general field of evaluation has been in the area of process evaluation. In
order to evaluate the performance of teachers, researchers have operationally
defined a large number of teacher behaviors or "competencies" which have been
shown to relate to pupil achievement. Many of these teacher behaviors and
related instrumentation have been used by evaluators to study the processes with
which instructional staff implement educational programs and curricula. (See
Borich, 1977, for other contributions of the school and teacher accountability
movement.)

A summary of the contributions to evaluation associated with operationalism,
curriculum reform, ESEA legislation and school accountability appears in Table 1.



Table 1

Some Contributions Associated with
Four Milestones in the Field of Evaluation

Milestones                      Contributions

1. Operationalism               Defining constructs by the procedures
                                  used to measure them
                                Use of behavioral objectives for program
                                  design and evaluation
                                Programmed instruction
                                Computer assisted instruction

2. Curriculum reform            Increased federal expenditure in
                                  education
                                New initiatives in instructional
                                  techniques and materials
                                Cooperation of scientists and teachers
                                  on the design of curricula
                                Integration of curriculum development
                                  and evaluation as a unitary process
                                  (formative evaluation)
                                Doctoral training programs in evaluation

3. Elementary and Secondary     Federal commitment to evaluation
   Education Act of 1965        Federally mandated and funded evaluations
                                The principle of refunding contingent
                                  on evaluation results
                                Project accountability at the local level

4. School accountability        Teacher and administrator accountability
                                Pupil behavior as criterion of program
                                  (teacher) success
                                State mandated evaluations
                                Process evaluation techniques and
                                  instruments
                                Evaluation as feedback for professional
                                  development

Response to the Demand for Effective Evaluation

Although citizens were generally positive about the explicit mandates
contained in ESEA legislation and California's Stull Act, it became evident by
mid 1970 that educators were not prepared to effectively implement either of
these new mandates. Moreover, the sudden increase in demand for capable
evaluators brought about by these mandates quickly exhausted the supply. Few
educators had any formal training in evaluation and often local school personnel
were pressed into service as program evaluators.

One obstacle to the implementation of these mandates was the inability of
local, state and federal administrators to apply the mandates. The evaluation
concepts created by educators in the preceding decade no longer seemed adequate
to answer the questions which now were being asked of these programs. After
reviewing the evaluation reports of ESEA programs, Guba (1969) concluded that,

    The traditional methods of evaluation have failed educators in
    their attempts to assess the impact of innovations in operating
    systems. Indeed, for decades the evidence produced by the
    application of conventional evaluation procedures has
    contradicted the experiential evidence of the practitioner.
    Innovations have persisted in education not because of the
    supporting evidence of evaluation but despite it. (p. 28)

And at another point argued that,

    When the evidence produced by any scientific concept or technique
    continually fails to affirm experiential observation and theory
    arising from that observation, the technique may itself
    appropriately be called into question. (p. 30)



With the emergence of ESEA came not only a need for new management
strategies to monitor these programs but also a need for improved evaluation
designs to test their effectiveness.

Reflecting on the current state of evaluation practice, the report of the
Phi Delta Kappa (PDK) national study committee on evaluation (Stufflebeam,
Foley, Gephart, Guba, Hammond, Merriman and Provus, 1971) concluded that
evaluation was "seized with a great illness" (p. 4). The "symptoms" of this
illness, as stated by the PDK committee, were:

(1) The Avoidance Symptom - Evaluation is perceived as a
painful process which may expose a school district's
programs or individuals' shortcomings. Evaluation is
avoided unless absolutely necessary.

(2) The Anxiety Symptom - Evaluation evokes anxiety. The
educator as well as the evaluator knows how cursory,
inadequate, and subject to error the evaluation process
can be. The ambiguity in the evaluation process engenders
anxiety in both the educator and evaluator.

(3) The Immobilization Symptom - Despite federal requirements
to evaluate, evaluative data on educational programs,
products and procedures are still rare. This lethargy
and lack of responsiveness is symptomatic of deeper ills.

(4) Lack of Theory and Guidelines Symptom - There is a lack
of unified theory of evaluation. With evaluators differing
among themselves about what evaluation should and should
not be, the evaluator in the field is left to his own
devices for conducting evaluative inquiry; there are few
useful guidelines for him to follow.

(5) The Misadvice Symptom - There is ample evidence that
evaluation consultants have provided educational
practitioners with poor advice. Not only is there a lack of
adequate guidelines but obtaining advice from an evaluation
"expert" is no guarantee that a technically sound evaluation
report will result.

And, to these were added the lack of trained personnel, the lack of
knowledge about decision processes, the lack of values and criteria for judging
evaluation results, the need to have different evaluation approaches for different
types of audiences, and the lack of techniques and mechanisms for organizing,
procuring and reporting evaluative information.

The foregoing suggests that at the beginning of the past decade the relatively
new discipline of evaluation was indeed besieged with problems which could be
conceptualized as deficiencies. These deficiencies, though, were themselves
symptoms of a more fundamental ill: the lack of an adequate definition of
evaluation and the lack of adequate evaluation theory.

Traditional Definitions of Evaluation

The lack of an adequate theoretical base for the discipline of evaluation
has often been cited as a factor which has stifled the development of the
field and its ability to provide meaningful evaluative data to educational
practitioners. Even more problematic, however, is the lack of consensus among
evaluators as to how evaluation should be defined.

Evaluation has been arbitrarily defined in a number of ways. Four
definitions which have achieved some popularity during the development of the
field are the following.

Evaluation as measurement. This early definition of evaluation came
to the forefront during the 1920's and 1930's with the rise of the measurement
movement in psychology and education. Evaluation received considerable
impetus from the emergence of the science of measurement and it is not
surprising that the terms were equated during the 1930's. More current
measurement definitions have been expanded to give a broader focus to the term
evaluation while maintaining the close tie to measurement. Consider the
following definition from a measurement text by Thorndike and Hagen (1965,
p. 27):

    The term "evaluation" as we use it is closely related to
    measurement. It is in some respects more inclusive,
    including informal and intuitive judgments . . . saying what
    is desirable and good. Good measurement techniques
    provide the solid foundation of sound evaluation.

Defining evaluation as measurement has the advantages of building directly
on the scientific measurement movement with its attendant objectivity and
reliability. Further, measurement instruments yield data which are
mathematically and statistically manipulatable, facilitating the establishment
of norms and standards. The disadvantage of this definition of evaluation is
that it is totally dependent on the development, administration, scoring and
interpretation of measurement instruments (tests, questionnaires, attitude
scales, etc.) which take time to develop and are relatively expensive. This
approach also obscures judgments and judgment criteria. Scores become entities
unto themselves while concepts behind the scores tend to be obfuscated. A
final disadvantage, and perhaps the most important, is that variables which
do not lend themselves readily to measurement are often eliminated or ignored.
(See Thorndike and Hagen, 1965, and Ebel, 1965, for further explication of
this approach to evaluation.)

Evaluation as determining congruence. This widely accepted definition of
evaluation is concerned with the congruence between performance and objectives,
i.e., determining the degree to which the performances of students are congruent
with the objectives of instruction. The major proponent of this definition was
Tyler who, reporting on his Eight Year Study of Progressive Education (Smith
and Tyler, 1942), viewed educational objectives as changes in behavior. If a
program succeeded in bringing about the desired changes (i.e., if there was a
congruence between student performance and the objectives) then the program was
judged successful.

A major advantage of this approach is that it forces educators to
conceptualize clearly the goals of instruction and requires their full
articulation. Further, this emphasis on objectives provides at least implicit
criteria for judging the success of a program. Another distinct advantage of
this definition is that it allows for the evaluation of educational processes
(e.g., teacher behavior) as well as educational products (e.g., student
achievement).

One disadvantage of this definition is that objectives have to be made
specific to be measurable, which may obscure important but less specifiable
objectives intended by program developers. Another disadvantage is the heavy
emphasis placed on student behaviors. A new staffing policy or instructional
strategy may be evaluated in terms of student achievement, and such issues as
cost-effectiveness, teacher satisfaction and student discipline may be ignored.
A related disadvantage of emphasizing student achievement is that congruence
evaluations tend to be ex post facto. Although Tyler's approach allows for
evaluation of process, the data emphasized in this approach, that of student
performance, are available only at the end of the project when the performance
of students is compared to program objectives. Thus, valuable process data are
often not collected (or at least not emphasized) and the opportunity for
feedback and program modification is often lost. (See Tyler, 1950, and Furst,
1964, for a further discussion of this definition of evaluation.)

Evaluation as professional judgment. The definitions discussed above place
little emphasis on the judgmental process. Attaching value to the data was
assumed. In this definition evaluation is professional judgment. The most
common practice in this approach is site visitation, such as that used in
accrediting schools and colleges. A visiting team of experts comes to "soak up"
the environment and to use their expertise in rendering a judgment of program
effectiveness.

Advantages of this approach include ease of implementation, consideration
of a large number of quantitative and qualitative variables (including the
context, experience and expertise of the evaluators) and quick "turn around"
of results and conclusions. Major disadvantages include the questionable
objectivity and reliability of the judgments that are made, the ambiguity of the
judgment criteria, and the difficulty in generalizing results of the evaluation
to other programs or institutions.

Evaluation as applied research. Although evaluation usually has not been
defined in terms of research, a sorting through of evaluation studies reveals
a strong reliance on the scientific method and an even heavier emphasis on
the experimental designs and statistical tools of research. This result is
not surprising when considering that the typical evaluator is usually
extensively trained in the methodology of research and often only minimally
trained in those concepts unique to evaluation.

Despite obvious advantages of classical research methodology, such as
experimental control over variables and the statistical power of parametric
statistical techniques, there are practical considerations which limit the
applicability of these procedures to educational problems. These were presented
by Stufflebeam et al. (1971) and are updated and summarized below with some
extensions and modifications.

1. Laboratory antisepsis. Cooley and Lohnes (1976) point
out that scientific research attempts to validate the
existence of cause-and-effect relationships with the
ultimate goal being the development of a consistent and
parsimonious theory of natural phenomena. Evaluation
research, on the other hand, is concerned with means-end
relationships with the ultimate goal being a rational
choice between alternatives for action. Because scientific
research pursues universal laws, knowledge must be obtained
in a context-independent way. Experimental manipulation
is used to control all confounding and extraneous
variables. The evaluation of an educational program is concerned,
however, with all the mitigating variables affecting some
educational outcome. "In order to provide useful data,
educational evaluation does not need the antiseptic world
of the laboratory, but the septic world of the classroom
and school" (Stufflebeam et al., 1971, p. 22). Laboratory
research designs require conditions usually not attainable
in evaluation contexts.

2. Effects of intervention. In scientific research, variables
are manipulated by the experimenter to create critical
comparisons of the ways variables interact. Thus, the
experimenter's intents become part of the data. The evaluator,
on the other hand, attempts to assess interactions in a
real rather than contrived environment. His data collection
must be done unobtrusively so as to not confound
his results.

3. Terminal availability of data. Research designs typically attempt
to assess the effect of some experimental treatment. The
treatment is administered, then data are collected and
analyzed. Data for making judgments are available only
after the treatment has been administered. This precludes
the use of data to refine a treatment, although continuous
refinement of an ongoing educational program is a frequent
function of evaluation.

4. Single treatments only. For purposes of experimental
control, scientific research requires that a treatment
be evaluated alone. If several treatments are operating
simultaneously, their effects will confound each other.
Educators, on the other hand, cannot withhold a potentially
beneficial educational program because students
are concurrently enrolled in other treatments.



5. Effects of control variables. Random assignment is
generally not possible in educational settings. Thus,
to equate treatment groups (in order to enhance their
comparability) evaluators usually match groups on
selected control variables such as intelligence levels,
ethnic mix, classroom size, socioeconomic status, and
the like. The problem with this procedure is that
criterion variables (such as measures of cognitive or
affective achievement) are often correlated with these
control variables, causing treatment differences to be
obscured.

6. Inapplicability of assumptions. Some assumptions
underlying the use of parametric statistical procedures
may not be met in the usual evaluation setting,
for example when distributions are severely skewed,
relationships nonlinear, or group variances
unequal.

7. Restricted decision rules. Conventional statistical
techniques contain decision rules of the simple "go-no
go" variety. A null hypothesis may be rejected or
accepted or treatment X may be judged better than
treatment Y. Evaluators are often asked to bring their
expertise to bear in more complex decision settings.

In a similar fashion Hemphill (1969) has distinguished research from
evaluation along six dimensions: problem selection, replication, determination
of data, determination of hypotheses, values and control. To emphasize
the differences between research and evaluation, Hemphill cast these
dimensions in parallel form. These dimensions are noted in Table 2.



Table 2

Contrasts Between Research and Evaluation

    Research                                Evaluation

1.  Problem selection and definition        Many people may be involved in the
    is the responsibility of the            definition of the problem and,
    individual doing the study.             because of its complexity, it is
                                            difficult to define.

2.  Given the statement of the problem      The study is unique to a situation
    and the hypothesis, the study can       and seldom can be replicated,
    be replicated.                          even approximately.

3.  The data to be collected are            The data to be collected are
    determined largely by the problem       heavily influenced if not
    and hypothesis.                         determined by feasibility.

4.  Tentative answers may be derived        Precise hypotheses usually cannot
    by deduction from theories or by        be generated; rather the task
    induction from an organized body        becomes one of testing
    of knowledge.                           generalizations, some of which may
                                            be basically contradictory.

5.  Value judgments are limited to          Value judgments are made explicit
    those implicit in the selection         by the selection and definition of
    of the problem.                         the problem as well as by the
                                            development and implementation of
                                            the study.

6.  Relevant variables can be               Only superficial control of
    manipulated or controlled by            potentially confounding variables
    including them in the design.           can be achieved.

While all-exclusive distinctions between research and evaluation are often
subjects of controversy, most evaluators and researchers implicitly support a
broad separation between these two modes of inquiry. So sharply has the line
between research and evaluation been drawn at times that some evaluators contend
that the two modes of inquiry are basically incompatible and ultimately must
employ different methodology.

Models for Evaluation 

Different conceptions of evaluation have spawned numerous paradigms or
models for implementing an evaluation study. These paradigms or models, however,
represent different conceptions of evaluation more than they do different
objectives or contexts for evaluation. Matching a particular type of evaluation
problem to a particular model does not seem possible nor does there seem to be
explicit rationale as to why an evaluator might choose one model over another.
This has left evaluators without criteria for selecting the most appropriate
model for a given evaluation problem.

Some educators and program developers operate under the assumption that a
variety of specific evaluation models exist which are readily applicable to
their particular educational problem. When the time for evaluation comes,
the task is deemed a simple one of selecting an appropriate model, plugging
the program into it and analyzing the results. Evaluation models generally
are not precise or specific, however, and the choice of an evaluation model is
itself a value judgment about how an educational program should be evaluated. An
alternative to selecting a general model is to adapt a model developed for a
specific setting and generalize it to one's own problem context. Highly specific
models are, however, developed within a narrow context and are generalizable
only to those settings which have identical or highly similar administrative
organizations, funding and political presses, personnel compositions, data
analysis support systems, client populations, educational objectives, and
personnel biases about what is and is not important in evaluating a program.

A variety of evaluation models abound in the professional literature.
Some are purposively general so as to be applicable to a variety of educational
contexts (cf. Hammond, 1973; Metfessel and Michael, 1967; Provus, 1971; Stake,
1967; Stufflebeam et al., 1971), while others are developed to meet evaluation
needs in a specific setting (cf. Belliott, 1969; Dykstra, 1968; Emrick, Sorenson,
Stearns, 1973; and the Interservice Procedures for Instructional Systems
Development (IPISD) model, 1975). To underscore their general nature three
popular evaluation models are summarized below.

The Discrepancy Evaluation Model

This model, developed by Provus (1971), divides evaluation into five stages.
Stage 1. This stage documents program description. The evaluator obtains
from the program staff a comprehensive description of program inputs, processes
and outputs. These are compared to the staff's definition of the program.
Discrepancies are noted and used to modify program definition such that it is
congruent with program components.
Stage 2. In this stage field observations are used to determine if the
program is being implemented as intended. Discrepancy information is used
to modify program implementation. This is also called "process" evaluation.
Stage 3. In this stage it is determined whether program components are
engendering the attainment of intermediate or enabling educational objectives
as intended. It is a check on whether student behavior is changing as expected.
Discrepancy information is used to modify either the program components or the
objectives. This stage is similar to Scriven's (1967) concept of formative
evaluation.



Stage 4. In this stage it is determined whether program components are
leading students to terminal program objectives. This stage often uses
pre-post behavior change and sometimes control vs. experimental comparisons.
This stage is similar to what is called summative evaluation.
Stage 5. In this stage (which is not always applicable) the experimental
program is compared to a realistic alternative. An experimental or
quasi-experimental design (Campbell & Stanley, 1966) is used to prove that
program benefit is commensurate with cost.

This model's components include agreeing on program standards, determining
whether a discrepancy exists between aspects of a program and standards governing
those aspects, and using discrepancy information to identify program weaknesses.
Discrepancy information at each stage leads to a decision whether to proceed to
the next stage or to alter either program standards or operations. Advancement
to a subsequent stage is contingent on attaining congruence between operations
and standards at the previous stage. If congruence is not possible program
termination is recommended, although in practice this option is rarely chosen.
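
The stage-by-stage decision logic of the discrepancy model can be sketched
in a few lines of Python. This is only an illustrative sketch, not a
procedure given by Provus: the names Stage, find_discrepancies, revise and
max_revisions are hypothetical, and the cap on the number of revisions is an
assumption standing in for the judgment that congruence "is not possible."

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    name: str
    # Hypothetical check: returns the discrepancies found between standards
    # and performance; an empty list means congruence has been reached.
    find_discrepancies: Callable[[], List[str]]


def run_discrepancy_evaluation(
    stages: List[Stage],
    revise: Callable[[str, List[str]], None],
    max_revisions: int = 3,  # assumption: a cap standing in for "not possible"
) -> List[Tuple[str, str]]:
    """Advance stage by stage; each stage must reach congruence first."""
    log: List[Tuple[str, str]] = []
    for stage in stages:
        for attempt in range(max_revisions + 1):
            gaps = stage.find_discrepancies()
            if not gaps:
                # Congruence reached: proceed to the next stage.
                log.append((stage.name, "congruent"))
                break
            if attempt == max_revisions:
                # Congruence not attained after repeated revision.
                log.append((stage.name, "terminate"))
                return log
            # Alter either standards or operations, then check again.
            revise(stage.name, gaps)
            log.append((stage.name, "revised"))
    return log
```

The returned log records, for each stage, whether it was revised, reached
congruence, or led to a recommendation to terminate. What counts as a
discrepancy, and how a program is revised, is left to the caller.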
The Stake Model

This model (Stake, 1967) divides educational programs into three major
concepts:

1. Antecedents - conditions existing prior to training that may be
related to outcomes such as previous experience, interest and
aptitude.

2. Transactions - encounters of students with teacher, author with
reader, parent with counselor or some educational activity such
as the presentation of a film, a class discussion or working
a homework problem.

3. Outcomes - measures of the impact of instruction on students,
teacher, administrators, parents or others. These are usually
measures of abilities, achievements, attitudes, aspirations,
etc. Outcomes can be immediate or long range, cognitive or
affective, personal or community-wide.

To Stake, evaluation involves (1) examining the logical contingencies
that exist between intended antecedents, transactions and outcomes; (2)
determining the congruence between intended and observed antecedents,
transactions and outcomes; and (3) determining the empirical contingencies
between observed antecedents, transactions and outcomes. Illogical
contingencies, lack of congruence, and possibly, a failure to establish
empirical contingencies aid in identifying program weaknesses.
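
The second of these activities, checking congruence between intended and
observed antecedents, transactions and outcomes, lends itself to a small
illustration. The Python sketch below is an assumption-laden simplification,
not part of Stake's model: it treats each category as a plain list of labels
and the function name congruence_report is hypothetical. Logical and
empirical contingencies are not addressed.

```python
from typing import Dict, List

CATEGORIES = ("antecedents", "transactions", "outcomes")


def congruence_report(
    intended: Dict[str, List[str]],
    observed: Dict[str, List[str]],
) -> Dict[str, Dict[str, List[str]]]:
    """Compare intended with observed entries in each of Stake's categories."""
    report = {}
    for category in CATEGORIES:
        planned = set(intended.get(category, []))
        seen = set(observed.get(category, []))
        report[category] = {
            # Intents that were actually observed.
            "congruent": sorted(planned & seen),
            # Intents with no matching observation (lack of congruence).
            "intended_not_observed": sorted(planned - seen),
            # Observations nobody planned for (unintended effects).
            "observed_not_intended": sorted(seen - planned),
        }
    return report
```

In practice the comparison would rest on descriptive data and judgment
rather than exact label matching; the sketch only shows where a lack of
congruence would surface in each category.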

The CIPP Evaluation Model

This model, developed by the Phi Delta Kappa Commission on Evaluation
(Stufflebeam et al., 1971), divides evaluation into four distinct strategies:
Context evaluation, Input evaluation, Process evaluation and Product
evaluation, thus the acronym CIPP. Context evaluation has as its objective
to specify the operational context and to identify problems underlying needs.
Input evaluation is concerned with identifying and assessing system
capabilities. The objective of process evaluation is to identify defects in
procedural design or implementation and to document project activities. The
goal of product evaluation is to relate outcome information to objectives and
to context, input and process information. If those relations are not
specifiable, program weaknesses are suspected.

As can be noted from this overview, evaluation models represent very general
aids or heuristics to conceptualizing evaluation designs. Other more technical
models embodying more specificity have been developed for highly specialized,
idiosyncratic applications but these have limited generalizability across
educational settings.
Models as Heuristics

The desire among evaluators to identify models is understandable, since
there is the hope that once these models are established they can be
used in a large variety of evaluation contexts. However, evaluation does
not work that way. The techniques and methods brought to bear in an evaluation
are a function of the problem, the clients for whom the evaluation is being
conducted and the amount of time and money which can be devoted to it. While
evaluation is certainly not an art form, evaluation models can communicate
only a relatively small set of categories and constructs that might be useful
in planning an evaluation. Thus, in part, the problem of choosing the correct
evaluation model derives from a somewhat natural tendency to see an evaluation
model as more than it is, as a methodology for actually conducting the
evaluation instead of a meta-methodology or framework into which must be
plugged more specific constructs and methods.

This false expectation for evaluation models has been known to lull
educators into not giving much thought to the evaluation process. Further,
when it is discovered that there is not a "tight" evaluation model available
to provide needed evaluative data, it is often considered a shortcoming of
the evaluator. Good evaluation procedures require the input of evaluation
specialists early on in program planning and development. The frustrations
of educators over evaluation often stem from their own failure to consider
evaluation issues throughout the educational program development process.
Surprising to some is the fact that an evaluator is not an all-knowing guru with
a magical bag of tricks (models) that will compensate for the failure to properly
consider and plan evaluation activities early on in program planning and
development. Evaluation models do not provide answers but do provide useful
guidelines or heuristics which can help organize thinking about how an
evaluation should be conducted. This heuristic role for models, which has not
always been appreciated in evaluation theory or practice, has been described by
Kac (1969):
     The main role of models is not so much to explain and to
     predict -- though ultimately these are the main functions of
     science -- as to polarize thinking and to pose sharp questions.
     Above all, they are fun to invent and to play with, and
     they have a peculiar life of their own. The "survival of
     the fittest" applies to models even more than it does to
     living creatures. They should not, however, be allowed to
     multiply indiscriminately without real necessity or real
     purpose.



A common approach to obtaining useful evaluation data has been to develop
one's own model borrowing, where appropriate, from existing models in the
literature, thus avoiding the model selection problem. This is generally the
preferred alternative for educators who wish to assure the best match of program
purpose and context to an evaluation model. Table 3 indicates the fundamental
evaluation concepts shared by these three models and where they tie into each
model. Table 4 provides a specification matrix indicating how each model
addresses major considerations in choosing an evaluation design. Taken together
these tables provide a means of judging the applicability of the component parts
of these models to specific evaluation contexts. Further, they serve to
illustrate the commonalities and distinctions encountered in studying
established, well-known evaluation models. However, evaluators must be mindful
that some solutions which are suitable for models may not apply to the real
world. Evaluators must never forsake the real world for the complexity of models
which purport to describe it.
TABLE 3

Some Commonalities Among Three Evaluation Models*

General Concept

Provus

Stufflebeam

Stake

Transaction
Enabling Behavior
Input Evaluation
Product Evaluation
Process Evaluation
Program Definition
Standards
Objectives
Judgment
Context
Antecedents

Transaction
Enabling
Stage 1
Stage 4

Installation Stage

Instrumental
Stage 2
Stage 4
Process Stage

Program Definition Stage Input Stage
Each Stage
Program Definition
Stages 1-5

Context

Transaction

Immediate

Antecedents

Outcomes

Congruency

Logical Contingency

Relative / Absolute

Intents

After Description
Antecedents

*Wording used by these authors appears in their respective columns.
TABLE 4

Characteristics of Three Evaluation Models

Characteristic*       CIPP                Stake               Provus

1. Purpose of         to make better,     to describe and     to uncover
   evaluation         more defensible     judge the merit     discrepancies
                      decisions           of a thing          between standards
                                                              and performance

2. Implied role of    information         makes judgments     compares standards
   evaluator          provider, serves    about the           with performance at
                      the decision-maker  effectiveness of a  various stages to
                                          program from        revise or terminate
                                          descriptions &      program
                                          standards

3. Relationship to    high                high, "intents"     high, "standards"
   objectives                             are objectives      are objectives

4. Types of           context, input,     description,        program definition,
   evaluation         process, product    judgment, logical & installation,
   activities                             empirical           process, product,
   proposed                               contingency,        cost/benefit
                                          congruency

5. Unique constructs  context             logical contingency discrepancy

6. Relationship to    integral            unclear             high
   decision maker

7. Some criteria for  Did the evaluator   Did the evaluator   Did the evaluator
   judging            collect context,    look for logical    collect data and
   evaluations        input, process &    contingencies &     check for
                      product data?       collect judgment    discrepancies
                                          data?               within each stage?

8. Implications for   mostly qualitative  deals mostly with   comparisons between
   evaluation designs decisions except    descriptions and    standards &
                      for product         judgments. Control  performance at each
                      evaluation, where a group helpful but   stage are essential;
                      control group is    not necessary,      control group is
                      applicable          judgments can be    needed for
                                          absolute            cost/benefit stage

*Constructs in this column were selected from Worthen and Sanders (1973).
III. Prospects for the Immediate Future: Emerging Trends in Educational
Evaluation

Since the initial ESEA legislation of 1965, some evaluators (Apple, 1974;
Cooley & Lohnes, 1976; Kaufman, 1972; Guba, 1978; Provus, 1971; Scriven, 1973;
Stake, 1967, 1970; Stufflebeam et al., 1971) have attempted to provide a stronger
basis for evaluative theory and in so doing have implicitly or explicitly
offered new definitions and theoretical bases for evaluation. These new
conceptualizations build on previous ones and can be broken into four types
or styles of evaluation: decision-oriented evaluation, value-oriented
evaluation, naturalistic evaluation and systems-oriented evaluation.
Decision-oriented Evaluation 

The PDK National Study Committee on Evaluation (Stufflebeam et al., 1971)
defines educational evaluation as:

     ...the process of delineating, obtaining, and providing useful
     information for judging decision alternatives (p. 40).

Provus (1971) similarly defines evaluation as:

     primarily a comparison of program performance with
     expected or designed programs, and secondly, among other
     things, a comparison of client performance with expected
     client outcomes (p. 12).

It can be seen that these decision-oriented definitions are heavily influenced
by Tyler's (1950) congruence definition of evaluation but are of a much broader
scope and are oriented toward a decision tree logic. Inherent in this approach
is an emphasis on comparing "what is" with "what should be" and using discrepancy
data as a basis for decisions.

The major advantage of this approach is that by following the models associated
with these definitions an evaluator is better able to provide the kinds of
information desired by decision-makers. Acceptance of the decision-oriented stance
requires that clearly defined goals and objectives be elucidated prior to the
collection of data, thus ensuring the presence of adequate criteria for judging
the adequacy or relative merit of a program. The presence of prespecified
criteria for judging program effectiveness may, however, be a disadvantage,
as the following discussion notes.
Value-oriented Evaluation 

Some authors in the field of evaluation have taken exception to the notion
of decision-oriented evaluation and its implications for the conduct of
evaluative research. The primary criticism of decision-oriented definitions is
that evaluation is viewed as a shared function. The role of the evaluator is to
provide a decision maker with meaningful information; the decision maker makes
the actual judgment of value or merit.

A value-oriented definition of evaluation stresses the value judgments made
in evaluating educational programs and describes the act of judging merit or
worth as central to the role of the evaluator. Worthen and Sanders (1973) define
evaluation as "...the determination of the worth of a thing" (p. 19). Scriven
(1967) considers the evaluator who does not participate in the decision making
process as having abrogated his role. Stufflebeam et al. (1971) and Stake
(1967) argue that by participating in decision making, the evaluator loses his
objectivity and hence, his utility. Differences in these approaches are more than
semantic for they imply different evaluation activities.
Within the decision-oriented approach, the evaluator is dependent upon the
decision maker for the way the decision context is to be defined and for the
values and criteria that are to be used to judge program success (these are
usually termed program intents, goals, or purposes). Cooley and Lohnes (1976)
and Apple (1974) point out that there is no evidence to suggest that the
decision maker is any more capable than the evaluator to define decision
settings, alternatives, and values. Indeed there may be (and often are) social,
institutional, and political presses on the decision maker which may lead him to
opt for evaluation procedures that skirt or ignore key evaluation issues. Apple
(1974) makes the case that decision-oriented evaluation is a conservative practice
not conducive to the acceptance of educational innovation but rather supportive
of the status quo. Apple's point is that the limits of the decision-oriented
evaluator's work are circumscribed largely by the already developed program and,
therefore, the evaluator cannot deal with the issues, concerns and objectives which
predate the program and to which the program is supposed to be responding. Once
the program is in place, the evaluator's role is to work with it (i.e., revise
or modify it) regardless of whether it is the best means to the desired end.

Scriven (1974) argues that value judgments are a crucial part of all sciences,
particularly methodological value judgments, and there is no reason to dismiss
them in evaluation. He calls for goal-free evaluation, insisting that all aspects
of an educational program should come under the scrutiny of the evaluator and
that nothing should be taken as given from the client or agency soliciting
evaluation expertise. The following illustrates his point:

     The goal-free evaluator is a hunter out alone and goes
     over the ground very carefully, looking for signs of
     any kind of game, setting speculative snares when in
     doubt. The goal-based evaluator, given a map that,
     supposedly, shows the main game trails, finds it hard
     to work quite so hard in the rest of the jungle.
     (Scriven, 1973, p. 327)

Scriven argues that while knowledge of goals is necessary for effective
planning and implementation it is unnecessary in evaluation and may even blind
the evaluator to important program effects.

Scriven (1973, 1974) and Apple (1974) also emphasized the social
responsibility of the evaluator. Scriven offers the hypothetical example of an
educational program aimed at increasing self-sufficiency. After some evaluative
activity the evaluator discovers that in addition to fostering self-sufficiency,
the program engenders contempt for the weak, sick, old and congenitally deformed.
Scriven contends that those findings should count against the program although
the program developer might be concerned only with the achievement of his
announced and intended goal. The welfare of the consumer (usually in the case
of education, society as a whole) is considered a proper concern of the evaluator.
Apple puts forth a similar argument:

     The tendency in the face of the all-too-usual finding of
     "no significant difference" is to argue for better teacher
     training, for better instructional materials, for more
     sophisticated administrative systems designs and the like.
     However, it may well be that more basic questions must be
     asked, that even the obligatory nature of the institution
     of schooling may need questioning, or that educators are
     asking the wrong kinds of questions...much low achievement
     on the part of the students could be attributable to a
     symbolic dismissal of school itself as a meaningful
     institution...unresponsive to human sentiments...Educational
     problems are considerably more fundamental than educators
     may suppose, and it places responsibility on the individual
     educator to examine his or her own professional activity
     in a wider social and political context. (Apple, 1974, pp. 28-29)
The implication of Apple's view for the evaluator has been elucidated by
Becker (1974), a sociologist, who foreshadows how the evaluator who fails to give
deference to the status quo is likely to be received by the decision maker:

     For a great variety of reasons, well-known to sociologists,
     institutions are refractory. They do not perform as society would like
     them to. Hospitals do not cure people; prisons do not rehabilitate
     prisoners; schools do not educate students. Since they are
     supposed to, officials develop ways of denying the failure of
     the institution to perform as it should and explaining those
     failures which cannot be hidden. An account of an institution's
     operation from the point of view of subordinates therefore
     casts doubt on the official line and may possibly expose it as
     a lie (Becker, p. 11*).

Becker believes that any approach the sociologist or evaluator might take
is inherently value laden and will implicitly support either the subordinate
(program participants') or superordinate (program manager's) point of view. While
this may be true, Becker's comment also raises the possibility that due to the
efforts of decision makers to protect the status quo or allow only changes
to be made that are congruent with the existing social, political and
organizational structure, evaluators may implicitly be designing evaluations that
examine only the efficacy of the program from [illegible] participant
point of view, avoiding all other points of view. Such a design is most likely
when the goals and objectives for a program must be taken as "givens" and the
evaluation designed around them.

Dewey's Conceptualization of Valuation

A cohesive value-oriented theoretical perspective on evaluation has recently
been put forth by Cooley and Lohnes (1976). Their stance is based on the early
work of John Dewey (Dewey, 1922, 1939) and borrows from Handy's work on the
study of values in the behavioral sciences (Handy, 1969, 1970; Handy & Kurtz,
1964). While the propositions of Cooley and Lohnes' theory of valuation are
quite similar to and generally subsume those of Apple, Scriven, Worthen and Sanders
and others, they are put forth in a more direct fashion that has practical
implications for some additions to evaluation methodology.

They assert that the value statements inherent in educational programs can
themselves "...be analyzed into a set of propositions subjectable to empirical
investigation and that failure to perform such analyses in evaluation studies
is inexcusable" (Cooley and Lohnes, 1976, pp. 9-10). They argue that the
values which have guided educational practice have traditionally been determined
by politics and custom and that their validity has not been challenged by
educational researchers. They find it curious that value propositions have evaded
empirical scrutiny despite educational researchers' heavy emphasis on empiricism.
Clear thinking about values in education is considered essential because
educational practice is generally influenced by the value attached to desired
educational goals. The alternative to rational inquiry into values is the
determination of values on the basis of power, which places the educational enterprise
"...at the mercy of special interest groups who commend values favorable to
themselves as universals" (p. 10).

A basic premise of Dewey's notions about values and valuations was that
values could be mistakenly viewed as absolutes only if they were considered out
of context. When considered in context, values lend themselves to elucidation
as propositions about real entities (matters of fact) and the error of ascribing
to them absolute or universal properties is thus avoided. The task of the
evaluator becomes one of ascertaining whether value propositions inherent in an
educational setting reflect only convention or tradition or whether they imply
empirically testable relationships between educational means and ends.

Consider the hypothetical example in which an evaluator is called in to
determine whether an inservice training program for teachers would increase the
teachers' appreciation of the difficulties encountered by Spanish-speaking
children in a predominantly English-speaking community. The foregoing discussion
suggests that the evaluator should consider the context before proceeding. Did
school administrators merely assume that a general inservice program would have
this effect? Was pressure applied to administrators to improve teacher
understanding of cultural differences? Was the program developed just because funds
were available? Or because it was politically expedient for an elected school
official? Or was it because a survey of teachers, parents, and students
indicated that such an inservice program would be beneficial? The latter
possibility is desirable but seldom encountered.

The value judgment explicit in the above example is that teachers need to
have a better appreciation of the educational difficulties encountered by
Spanish-speaking children. Also implicit is that the teachers presently are
insensitive to these problems, that these students are being shortchanged in
their education and that the administration is quite concerned over this state
of affairs. Each of these value propositions may or may not be true and is
capable of being empirically determined.

Optimally, the need for such a program would be ascertained before it is
developed and implemented. However, this is not always done. Evaluators are
usually ignored in program planning, development, and often in implementation.
This greatly limits the evaluation expertise that could be brought to bear in
the educational setting. Evaluation has much to offer in terms of the "front
end" work of educational programming and significant inroads have been made in
the area of needs assessment (see Kaufman, 1972, 1976, 1977). This issue will
be discussed subsequently, but let it suffice to say here that the notions of
Dewey (1939), particularly as they are elucidated by Cooley and Lohnes (1976),
provide theoretical justification for the involvement of evaluators early on in
an educational endeavor generally, and for the conduct of needs assessments
particularly.

Another significant aspect of Dewey's theory of valuation is that he made
no absolute distinction between means and ends. Any educational event or
condition (e.g., a particular teaching strategy, student achievement in a particular
area, etc.) can be viewed as occupying space on a continuum such that it is
simultaneously an end to those events and conditions that preceded it and a
means to those that follow. For example, inservice education is a means to
improved teacher performance which in turn is a means to successful educational
settings, etc. Dewey (1922) makes the further assertion that it is only when an
end is conceptualized as a means that it is fully understood, appreciated or even
obtainable.

To some extent evaluators have taken means-end relationships into
account by dividing outcomes into enabling outcomes, those that are prerequisite to
the attainment of terminal outcomes, and terminal outcomes, those that are
expected at program completion. Provus (1971) carried the means-end continuum
one step further by articulating the concept of ultimate outcomes, those that are
expected sometime after program completion. For the evaluator following Provus'
model, terminal outcomes are also enabling in that they, too, become means to
still other, ultimate ends. Cooley and Lohnes (1976) have argued that there
can be no ultimate outcomes unless one appeals to some higher order good.
Hoban (1977) suggests that these higher order ends might be chosen from among
the values shared by our society such as affection, enlightenment, rectitude,
respect, skill, power, wealth and well-being, concepts with which a philosopher, not
an evaluator, would be comfortable. Yet, it would be admirable for the
evaluator to make explicit the means-end relationship which is implicit in
every evaluation setting, testing its logic and direction against some
acknowledged higher-order good at least one step up on the means-end continuum.

The immediate problem for the evaluator is one of determining where to
break into the means-end chain for purposes of data collection. Infinite regress
is possible in either direction. Cooley and Lohnes suggest that focusing on
the present resolves the dilemma. By striving to endow "...present educational
policies with a more unified meaning" (p. 13), the evaluator establishes for
himself a bounded context for his evaluative activities. This context cannot,
however, be too restrictive. Judging the relative value of several competing
alternatives, for example, is very much influenced by each alternative's role as a means
to subsequent ends and it is through this logic that relative judgments of worth
can be made.

The question is always what kind of world we want.
It is never the narrow one of how to maximize some
fixed type of gain...The very important principle
is that clarification and transformation of aims or
goals of education will be a result of, not a
prerequisite for, evaluation research. (Cooley and Lohnes, p. 14)

This approach to evaluation also emphasizes that both means and ends are
subject to judgments of value. This position is similar to Scriven's (1967)
concepts of formative and summative evaluation. Because the differences between
means and ends are seen as superficial, this theory of evaluation poses no
restrictions on the evaluation activities that may be pursued in either the
formative or summative mode and argues that both modes be utilized.

Another relevant point made by Cooley and Lohnes is that evaluation should
not be conceptualized as a single product (usually a monograph) delivered at the
conclusion of an evaluation. Rather, it should be viewed as a process in which
the evaluator interacts with all other interested parties for an extended period
of time. This allows for resolution of differences in opinion, viewpoint,
and interests. Cooley and Lohnes consider this version of educational evaluation
"...a process of conflict resolution through intelligent social deliberation"
(p. 16). This approach suggests an interactive mode which allows the emergence
of a common conceptualization of the educational program among all involved
parties and fosters a consensus on program need, design, implementation, and
evaluation.

This version of evaluation also stresses the education of all persons
involved in evaluation. Put simply, the evaluation should be a learning
experience for all involved. It is not unrealistic to expect that the various
parties to an educational endeavor should come to understand more precisely
what they are trying to do and why, how their educational programs achieve
the results they do, and how each participant may individually facilitate the
attainment of successively higher level educational ends in a meaningful way.
Thus, evaluation may be viewed as an educational procedure (or means) itself
which has as its potential ends-in-view a more harmonious, pleasant and
effective educational setting.

A final point about the value-oriented approach to evaluation is that it
is inherently humanistic. Educators who consider themselves in the humanistic
camp would be attracted to the value-oriented approach because it focuses on
the total effects of a program and short and long range outcomes as part of
a larger means-end continuum. Also, emphasis is placed on the empirical
validation of goals and values, thus preventing them from being determined
arbitrarily. The conceptualization of a means-ends continuum provides a
foresightful vision of ultimate program effects. The goal free bias inherent
in the approach provides a rationale for being sensitive to unknown or
unintended program effects. This theoretical view of evaluation has the potential
for breaking the traditional mental set of evaluation and provides evaluators
with a framework for providing information which can be used to reduce undesirable
conditions in society, such as illiteracy, anomie, crime, and racial
hate, when the amelioration of those conditions is stated as a higher order
end. Cooley and Lohnes state:






What has been missing in controversies over the schools
is convincing evidence which relates choices of educational
practices to ends which society values, ends which satisfy
needs. Generating such evidence is what evaluation is all
about. (p. lu)

Cooley and Lohnes' (1976) rediscovery and updating of Dewey's principles of
valuation represents a significant addition to evaluation. Primarily, it provides
logical and theoretical justification for evaluation concepts, designs, and
activities recently called for by other authors in the field (Scriven, 1967,
1973, 1974; Apple, 1974; Worthen and Sanders, 1973; Kaufman, 1972, 1977; Borich,
1977). This justification has been sorely lacking. Unguided by a prudential
theoretical basis, evaluation has moved in directions not always conducive to
the ultimate improvement of educational quality. Secondly, this theoretical
perspective has practical implications for the ways evaluation should be conducted
which are in some ways at variance with traditional approaches. Inherent in
this perspective is a call for new methods and new concepts in the field of
evaluation leading to a considerably expanded and more flexible role for the
evaluator. Lastly, this approach is ultimately concerned with evaluators'
responsibility for "doing the right thing" in terms of educational planning
and programming and offers a perspective for moving in that direction. This
higher order orientation has not always been present in evaluation theory or
practice.

Naturalistic Evaluation*

One of the new methods and new concepts called for by Cooley and Lohnes'
updating of Dewey's theory of valuation is that of naturalistic evaluation.
An outgrowth of ecological psychology (Barker, 1965, 1968), naturalistic inquiry
stands in contradistinction to the more formal models of evaluation previously
discussed. Naturalistic evaluation has been referred to as an alternative
to conventional evaluation methodology, breaking ties to both traditional
forms of instrumentation and traditional methods of data analysis.

While many definitions of naturalistic inquiry have been proffered, Guba 
(1978) has suggested that naturalistic inquiry differs from other modes of 
evaluation by its relative position along two dimensions: (a) the degree to 
which the investigator manipulates conditions antecedent to the inquiry, and 
(b) the degree of constraint imposed on the behavior of subjects involved in 
the inquiry. Accordingly, naturalistic inquiry has been defined as 

...any form of research that aims at discovery and
verification through observation. (Willems and Rauch,
1969, p. 81)

...slice-of-life episodes documented through natural
language representing as closely as possible how people
feel, what they know, how they know it, and what their
concerns, beliefs, perceptions and understandings are.
(Wolf and Tymitz, 1976-1977)

*I am indebted to Guba (1978) for much of the material upon which this section is
based. Readers are directed to his work for more on this topic.

...evaluation which attempts to arrive at naturalistic
generalizations on the part of the audience; which is aimed
at non-technical audiences like teachers or the public at
large; which uses ordinary language; which is based on
informal everyday reasoning; and which makes extensive use of
arguments which attempt to establish the structure of
reality. (House, 1977, p. 37)

In addition, naturalistic studies have been identified by Sechrest as ones which
(a) do not require the cooperation of the subject,

(b) do not permit the subject's awareness that he is being 
measured or treated in any special way, and 

(c) do not change the phenomenon being measured. 
(Willems and Rauch, 1969, p. 152) 

In theory, a naturalistic study consists of a series of observations that 
are, alternately, directed at discovery and verification. This process supposedly 
leads to successive reorientations on the part of the investigator toward the 
phenomena being observed and to further discovery. 

Unlike evaluators working from formal models, the naturalistic evaluator approaches data
collection (observation) with a minimum of preconceived categories or notions
of what will be seen, as though the behavioral phenomena were being observed for
the first time. Any effort to manipulate any part of the program prior to
observation or to constrain the behavior of those being observed would reduce
the "naturalism" of the method. How data are tabulated and analyzed in
a naturalistic study is left up to the investigator and no "best" method is
identified, although it invariably includes some form of unstructured observation
followed by a piecing together of relationships, patterns or consistencies in
the data which are used to further control and focus subsequent observations.
Data recording methods may include impressionistic accounts or ethnographic
records* of the phenomenon observed. From these accounts more structured
categories of behavior are derived, which then are expanded and verified through
still further observation.


Naturalistic inquiry is appropriately considered by its proponents a tool,
technique or method for viewing behavior and not exclusively a mode of evaluation.
Thus, as a general methodology - or perhaps meta-methodology - its basic tenets
would appear compatible with other forms or stages of evaluation which do not
classify as experiments, i.e., where conditions are not prearranged and subject
responses not constrained by the activities of the evaluator, e.g., goal-free
evaluation. Naturalistic inquiry need not be considered an all-exclusive
alternative to conventional models of evaluation when these other forms of
inquiry do not unduly constrain the "naturalism" of the inquiry. Conducive to
this line of reasoning is the idea that "naturalism" is always considered a
matter of degree, making trade-offs and multiple approaches to evaluation
possible. This view, however, receives little attention in the literature on
naturalistic inquiry.

* The observational record associated with the field of anthropology in which
behavior is recorded in relation to the context in which it occurs and is ascribed
meaning only in relation to this context.

The extent to which, and the manner in which, naturalistic inquiry has become inculcated
in the present day thinking of evaluators is of considerable interest. The
influence of naturalistic inquiry in this regard has been significant and
represents what might be described as the [illegible] movement away from conventional
evaluation models and more formalistic definitions of evaluation, namely the
measurement, congruency and applied research definitions. Oddly enough, it
is a return of sorts to the visitation type definition of evaluation, rejected
by many evaluators a decade ago for being too subjective and impressionistic, and
represents in spirit, if not method, the value-oriented approach to evaluation.
Value-oriented writers such as Dewey, Scriven and Apple would find solace in the
fact that naturalistic inquiry, more than most other methodological perspectives,
is likely to yield data unconstrained by preconceived notions about what the
program is or is not supposed to do. This perspective seems congenial to the discovery
of means-end relationships (Dewey), side effects and unanticipated program
outcomes (Scriven) and fundamental issues which question the very rationale
upon which a program is based (Apple).

While not embracing naturalistic inquiry directly, some evaluators have
turned to this approach as a result of what are perceived to be serious
limitations to conventional evaluation methods, namely: (a) that conventional models
have been too restrictive in the types of data that can be observed and therefore
may be insensitive to unique and unexpected program outcomes, (b) that
conventional evaluation may at times actually contrive data by manipulating dimensions
of a program which have no practical value in the real world and (c) that
conventional modes of evaluation, particularly those ascribed to either the
measurement or congruency definitions of evaluation, may actually constrain
through formal instrumentation the responses expected of subjects. In response
to these limitations several evaluators have developed "alternative models"
or approaches to evaluation which embody the elements of naturalistic inquiry.
These models do not depend upon the arrangement of antecedent conditions or
constraint of subject response; hence, the basic conditions for naturalistic
inquiry are met. These models, taken from Guba (1978), are reviewed briefly
below. (For further explication of naturalistic inquiry see Guba, 1978, and
Willems and Rauch, 1969.)

The Responsive Model. The first model with some relationship to naturalistic
inquiry is the responsive model developed by Stake (1975a, b). The responsive
model focuses on important issues and concerns pertaining to a program.
According to Stake, evaluation is responsive if it:

orients more directly to program activities than to 
program intents; responds to audience requirements 
for information; and if the different value perspectives 
are referred to in reporting the success and failure of 
the program. (Stake, 1975, p. 14) 

The primary purpose of responsive evaluation is to respond to audience requirements
for information and to bring to the foreground different value perspectives that
might be held by different audiences. Its methodology, like naturalistic inquiry
itself, is nonconstraining. Stake describes it in the following terms:

To do a responsive evaluation, the evaluator conceives
of a plan of observations and negotiations. He arranges
for various persons to observe the program and with their
help prepares brief narratives, portrayals, product
displays, graphs, etc. He finds out what is of value to the
audiences and gathers expressions of worth from various
individuals whose points of view differ. Of course, he
checks the quality of his records; he gets program
personnel to react to the accuracy of the portrayals; and
audience members to react to the relevance of his findings.
He does much of this informally - iterating and keeping
a record of action and reaction. He chooses media
accessible to his audiences to increase the likelihood and
fidelity of communication. He might prepare a final written
report; he might not, depending on what he and his clients
have agreed on. (Guba, 1978, pp. 34-35)

These activities are carried out in a series of steps which may be described as
(a) talking with clients, program staff and audiences, (b) identifying program
scope, (c) providing an overview of program activities, (d) discovering purposes
and concerns, (e) conceptualizing issues and problems, (f) identifying data needs
relevant to the issues, (g) selecting observers and instruments (if any), (h)
observing designated antecedents, transactions and outcomes, (i) thematizing
or preparing portrayals in case studies, (j) winnowing, matching issues to
audiences, (k) formatting for audience use, and (l) assembling formal reports
(if any).

The Judicial Model. A second evaluation model with some relationship to
naturalistic inquiry is the judicial model. Developed by Wolf (1975), Owens
(1973) and Levine (1974), the judicial model is patterned after the
administrative hearing in a court of law. The purpose of the judicial model is to
illuminate, inform and adjudicate issues related to the object or activity being
evaluated. Advocates or counsels take opposite views with respect to an issue
and argue as convincingly as possible their side of the issue. Jury and judge
hear testimony from "witnesses" and the presentation of facts regarding the
issue, then offer their opinion as to the merit or worth of the program and
their recommendations for improvement. Like the judicial process itself,
this approach to evaluation assumes that "truth" is more likely to emerge in
an adversary setting with two evaluators "pitted" against one another than in
the case of a single evaluator using conventional evaluation models and data
collection methods.

Generally, the following steps are employed in the judicial model:

(1) Issue generation. The issues are identified through
"fact-finding interviews" with samples of the audiences involved,
as in the case of the Stake responsive model.

(2) Issue selection. The purpose of this stage is to delimit
the number of issues and to prioritize them, so that they
may be manageable in a hearing format.

(3) Preparation of formal arguments. Each counsel or advocate
team prepares formal arguments related to the selected
issues. Available evaluation or other data may be used
(to be introduced as "exhibits" in the hearing stage), and
additional evidence may be collected, particularly evidence
in the form of depositions from witnesses. Additionally,
selected witnesses may be asked to give testimony at the
hearing itself.

(4) Pre-hearing discovery sessions. Each advocate team reviews
the major arguments it intends to make and discloses the
main features of its "evidence" for the other. Since the
hearing is not a "trial" in the conventional sense, but an
effort to determine "truth" as precisely as possible, each
side shares its findings with the other so that the hearing
may be as comprehensive as possible. In addition, the
advocate teams decide on ground rules, e.g., number of witnesses
to be called and criteria for determining admissibility of
evidence.

(5) The hearing. Modeled on an actual courtroom process, the
hearing involves an administrative officer and a "jury"
or hearing panel. After hearing the evidence the jury
carries out whatever tasks the advocate teams previously
agreed to assign to it, which usually involves at least
the determination of findings (which may include judgments
of worth) and the making of selected recommendations.
(Guba, 1978, pp. 36-37)

The Transactional Model. A third evaluation model with some relationship
to naturalistic inquiry is the transactional model described by Rippey (1973).
This model supposedly differs from conventional models in that it deals directly
with management conflicts and institutional change brought about by the
implementation of a program, utilizing what its authors call "open systems theory."
Transactional evaluation studies institutional disruptions brought about by
the program and works to ameliorate those disruptions through strategies for
conflict management.

Transactional evaluation has five phases:

(1) The initial phase. Pre-existing unrest or some other
troublesome situation exists. A meeting is set up of
interested parties under the direction of a "neutral"
evaluator working in a non-judgmental atmosphere.

(2) Instrumentation phase. During this phase, a
"Transactional Evaluation Instrument" (TEI) is developed whose
purpose is to provide the evaluator with insight into
the perceptions and expectations of various interest
groups. The instrument also provides a forum for the
sharing of opinions among the groups. The TEI is
developed and administered in group sessions, during
which (a) the evaluator initially formulates issues on
the basis of general expressions from the group, (b)
participants are asked to re-express opinions about
them, (c) the most representative and divergent of the
written responses are carefully worded into items that
can be rated on a scale from "strongly agree" to "strongly
disagree," (d) the instrument is administered to the group,
and (e) responses are examined.

(3) Program development. The program is redefined to reflect
those goals and values on which the group can achieve
some consensus.

(4) Program monitoring. Various groups agree to assume
responsibility for implementing and monitoring the
developed program.

(5) Recycling. As new conflicts emerge, the entire process
is recycled to whatever phase is appropriate.
(Guba, 1978, p. 38; Talmadge, 1975)
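
The examination of responses in the instrumentation phase can be sketched
in code. The Python fragment below is only an illustration and is not drawn
from Rippey or Guba: the five-point coding of the scale, the function name
summarize_item, the max_gap cutoff, and the sample responses are all
assumptions made for the example. It averages the ratings each interest group
gives to one TEI item and reports how far apart the group averages are, which
is one plausible way of spotting items on which the groups diverge.

```python
from statistics import mean

# Assumed five-point coding; the source names only the two endpoints
# ("strongly agree" and "strongly disagree").
SCALE = {
    "strongly agree": 5,
    "agree": 4,
    "undecided": 3,
    "disagree": 2,
    "strongly disagree": 1,
}


def summarize_item(responses_by_group, max_gap=1.0):
    """Summarize ratings of one item across interest groups.

    responses_by_group maps a group name to a list of rating labels.
    max_gap is a hypothetical cutoff: if the highest and lowest group
    means differ by no more than this, the item is flagged as consensus.
    """
    group_means = {
        group: mean(SCALE[label] for label in labels)
        for group, labels in responses_by_group.items()
    }
    gap = max(group_means.values()) - min(group_means.values())
    return {"group_means": group_means, "gap": gap, "consensus": gap <= max_gap}


# Hypothetical responses to a single item from two interest groups.
item = {
    "teachers": ["agree", "strongly agree", "agree"],
    "parents": ["disagree", "undecided", "disagree"],
}
summary = summarize_item(item)
```

Items flagged as non-consensus under such a rule would be candidates for
further discussion before the program development phase, in which the program
is redefined around the goals and values the group can agree on.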

The Connoisseurship Model. A fourth model with some relationship to
naturalistic inquiry is the connoisseurship model developed by Eisner (1975).
This approach views educational evaluation as a form of criticism. In Eisner's
view, criticism depends upon connoisseurship - or the private act of
appreciating and sensing the subtle qualities of an object or activity. "Critical
guideposts" used to conduct the evaluation are essential elements of the
connoisseurship approach. These guideposts represent the personal values and
concepts formed from tradition, experience and theories about the standards for
judging the object or activity. Guba (1978) characterizes connoisseurs as:

persons with refined perceptual apparatus, knowledge of
what to look for, and a backlog of previous, relevant
experience. They have the ability to recognize skills,
form, and imagination and to perceive the intentions and
leading conceptions underlying the entity being evaluated.
In effect, because of these characteristics, the
connoisseur is himself the evaluation instrument. Having
made his judgments, he communicates the qualities that
constitute the entity being evaluated, its significance,
and the quality of experience engendered by interaction
with it, often through the use of rich metaphors. (p. 39)

The Illumination Model. Perhaps most similar to naturalistic inquiry is the
illumination model developed by Parlett and Hamilton (1977). This approach to
evaluation relies heavily on open ended observations (but also questionnaires,
interviews and tests) to continuously record ongoing events in order to
identify (a) critical and nonobvious characteristics of a program, (b) the tacit
assumptions underlying it, (c) interpersonal relationships affecting it, and
(d) complex realities surrounding the program. In the authors' words:

Illuminative evaluation takes account of the wider contexts in
which education programs function. Its primary concern is
with description and interpretation rather than measurement and
prediction. It stands unambiguously within the alternative
methodological paradigm. The aims of illuminative evaluation
are to study the innovatory program: how it operates; how it
is influenced by the various school situations in which it is
applied; what those directly concerned regard as its advantages
and disadvantages; and how students' intellectual tasks and
academic experiences are most affected. It aims to discover
and document what it is like to be participating in the scheme,
whether as teacher or pupil, and, in addition, to discern and
discuss the innovation's most significant features, recurrent
concomitants, and critical processes. In short, it seeks to
address and to illuminate a complex array of questions.
(Guba, 1978, p. 40)

Illuminative evaluation is carried out in three stages:

(1) initial observations for the purpose of familiarization with
the day-to-day reality of the setting(s), largely in the
manner of social anthropologists or natural historians;

(2) more sustained and intensive inquiry into a number of common
incidents, recurring trends, and issues frequently raised in
discussion; and

(3) efforts to seek general principles underlying the
organization of the program, determine patterns of cause and effect
within its operation, and place individual findings within a
broader explanatory context. (Guba, 1978, p. 40)

Summary of Naturalistic Models

All five of the models presented qualify as naturalistic in that they adhere 
to the two primary conditions set forth by proponents of the naturalistic method: 
(a) they do not manipulate conditions antecedent to the inquiry and (b) they 
pose minimal constraints on the behavior of participants involved in the inquiry.
While always a matter of degree, these five models meet these conditions to
a greater extent than do most conventional approaches to evaluation.

However, it can also be noted that the five models are somewhat vague as
to the precise manner in which observations are to be conducted and the data
resulting from them converted into meaningful statements which serve some
client group. Conspicuously lacking both in summary and original documents
describing these models are descriptions of the processes by which responsive,
judicial, transactional, connoisseurship, and illuminatory accounts of behavioral
phenomena are gleaned of their most pregnant content and communicated to
audiences who desire answers to specific questions, some of which may have been
fashioned prior to program observation. If naturalistic methods are to enjoy
widespread use, the criteria by which value and importance are bestowed upon
data may need further delineation within the context of each model. The
absence of this delineation may result in what Kaplan (1974) has called "the
dogma of immaculate perception." In explaining the importance of values
in directing what the inquirer is looking for, Kaplan compares a value-free
inquiry, one which limits itself to just describing what objectively happens,
to the position of the esthetes at the turn of the century, who viewed art
as a matter of pure form or decoration, "at the cost of making of it an
idle song for an idle hour," with no significance for anyone but themselves.

We may also note that the concept of naturalistic inquiry was first
introduced as an alternative methodology to present day conceptions of
experimental design and not as an approach to serve the ends of evaluation.
Although the authors of naturalistic models have done an exemplary job of
making this relationship appealing, the match between naturalistic inquiry
and evaluation may not be as great as it might at first seem. The decision-
oriented contexts in which most evaluations occur are not always conducive
to the hypothesis generating and theory building purposes for which naturalistic
inquiry is best suited. Some audiences for evaluation studies may appreciate
being confronted with "issues" and "concerns." But other audiences may not be
so appreciative if specific questions requiring formal measurement and analysis
are left unanswered simply because they require altering antecedent conditions
or constraining subject responses. It is because of the diversity of what
clients desire and expect of an evaluation that the word "supplementary" rather
than "alternative" might be used to place naturalistic methods in their most
appropriate framework.

Finally, it is important to note the perspective or mind set the naturalistic
inquirer carries with him when studying behavior. This perspective, or
Weltanschauung (world-view), has been aptly captured by Louch (1966), who, in
the context of describing the role of explanation in the study of human action,
provides a good portrayal of the naturalistic inquirer and the commonality
between the naturalistic and value-oriented approaches to evaluation. In the
words of Louch, the world of the naturalistic inquirer is one in which:

"behavior cannot be explained by a methodology borrowed
from the physical sciences." For him, "what is needed
... is not measurement, experiment, prediction, formal
argument but appraisal, detailed description, reflection
and rhetoric. ... human action is a matter of appraising
the rightness or appropriateness of what is attempted
or achieved by men in each set of circumstances. Its
affinities are with morality rather than with the causal
or statistical accounts appropriate to the space-time
framework of the physical sciences. Its methods are
akin to the deliberations and judgments in the law rather
than the hypotheses and experiments of physics."
(Van Gigch, 1978, p. 220)

Systems-Oriented Evaluation

While much of the evaluation literature of the past decade focused on dis-
tinctions between evaluation and research and the insensitivity of the latter to
detecting the effects of innovative programs, conceptual models were being
developed interconnecting the planning, development and evaluation process.
These models, while not distinct from other approaches in their call to infuse the
discipline of evaluation with a broader methodology than research, were distinct
in their efforts to include within the domain of evaluation methodologies to
actually improve the process by which programs were being planned and developed.
To accomplish this purpose various systematic approaches to instructional develop-
ment were introduced posing "front-end" and/or predevelopment tasks for the
evaluator, unifying and integrating the previously separate processes of program
planning, development and evaluation.

Kaufman (1972), in the first modern text dealing with educational planning
from a systems perspective, defined system as:

The sum total of parts working independently and working
together to achieve required results or outcomes, based
on needs. (p. 1)

and the systems approach as:

A process by which needs are identified, problems selected,
requirements for problem solution are identified, solutions
are chosen from alternatives, methods and means are obtained
and implemented, results are evaluated, and required revisions
to all or part of the system are made so that the needs are
eliminated. (p. 2)

The particular systems approach articulated by Kaufman represents a type
of logical problem solving for identifying and resolving educational problems.
Central to this approach is the process of educational planning.

One example of the systems approach applied to planning and evaluation is
the Interservice Procedures for Instructional Systems Development* (U.S. Army
Training and Doctrine Command, 1975), a five-volume compendium on the "how to
do it" aspects of instructional systems development. While developed for the
military, this work represents a broad application of the systems approach to
training useful in virtually any type of setting. The Interservice Procedures
are divided into five separate and distinct phases to be carried out successively.
These phases, as described in the executive summary of the project, are:

*Center for Educational Technology, Florida State University.

Phase I, ANALYZE. This phase deals with procedures for defining what jobs
are, breaking these into statements of tasks, and using
numerical techniques to combine the best judgment of experienced
professionals to select tasks for training. Phase I also
presents processes for construction of job performance
measures and the sharing of occupational and training infor-
mation within and among client groups. It provides a rationale
for deciding whether tasks should be trained in schools, on
the job, or elsewhere, and also requires consideration of the
interaction between training and job performance.

Phase II, DESIGN. This phase deals with the design aspects
of the training program within selected settings. Design is
considered in the architectural sense in which the form and
specifications for training are laid down in careful detail.
Phase II reviews the considerations relating to entry behavior
of two separate kinds: general ability, and prior experience.
A rationale is presented for establishing requirements based
on the realistic evaluation of both of these factors.

Phase III, DEVELOPMENT. This phase refers to the actual preparation of
instruction. Determinations are made about how the students will
be managed, the kinds of learning experiences they will have, the
activities in which they will engage, and the form and content
of the instructional delivery system. Techniques are presented
for the careful review and adaptation of existing materials.
Procedures for the systematic design of instruction which can be
delivered in a variety of media are also included. Phase III
concludes with a procedure for testing and evaluating the instruc-
tion to insure that its performance meets expectations.

Phase IV, IMPLEMENTATION. This phase treats the necessary steps to imple-
ment the instruction according to the plan developed in Phase III.
Two steps highlight Phase IV, that of training the staff in the
procedures and problems unique to the specific instruction and
actually bringing the instruction on-line and operating it.
The Phase IV effort continues as long as there is a need for
the instruction.

Phase V, CONTROL. This phase deals with procedures and techniques for
maintaining instructional quality control standards and for
providing data from internal and external sources upon which
revision decisions can be based. Data collection, evaluation
of the data, and decision making about the implications
of the data represent the three principal functions described
in Phase V. Emphasis is placed on the importance of determining
whether the trainees are learning what was intended, and upon
determining whether what they have learned is of benefit in
carrying out post-training responsibilities.

The five phases describe the functions necessary to analyze instructional needs;
design, develop, and implement instruction; and maintain quality control
of instruction. Of primary importance is the sequential relationship of functions
within and between phases, giving this model its systems perspective.
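
The sequential relationship among the five phases can be pictured with a small
sketch. This is only an illustration, not part of the Interservice Procedures:
the names `Phase` and `next_phase` are hypothetical, and the rule that Phase V
data may send the work back to Phase I is an assumption made here to show how
"revision decisions" might re-enter the sequence.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The five phases, numbered in the order they are carried out."""
    ANALYZE = 1
    DESIGN = 2
    DEVELOPMENT = 3
    IMPLEMENTATION = 4
    CONTROL = 5


def next_phase(current, revision_needed=False):
    """Return the phase that follows `current`.

    Phases I through IV always hand off to their successor. CONTROL
    continues indefinitely unless a revision is called for; returning to
    ANALYZE in that case is an assumption of this sketch.
    """
    if current is Phase.CONTROL:
        return Phase.ANALYZE if revision_needed else Phase.CONTROL
    return Phase(current + 1)


# Walk the sequence once, from analysis through control.
step = Phase.ANALYZE
path = [step.name]
while step is not Phase.CONTROL:
    step = next_phase(step)
    path.append(step.name)
print(" -> ".join(path))
```

The integer ordering is what carries the "systems" idea here: no phase can be
reached without passing through those before it.
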

In a similar manner, Dick and Carey (1978) integrate the proces-
ses of planning, development and evaluation into a ten step approach. These
steps are: identifying instructional goals, conducting an instructional
analysis, identifying entry behaviors and characteristics, writing performance
objectives, developing criterion-referenced tests, developing an instructional
strategy, developing and selecting instruction, designing and conducting forma-
tive evaluation, revising instruction and conducting summative evaluation.
Their procedure is described in some 200 pages and 10 chapters explicating each
of these processes and integrating them into a single model. Other approaches
have added still further to the language and conceptual repertoire of the systems
approach, requiring the evaluator to conduct needs assessments, prepare program
specifications, perform task and learner analyses and define human and material
resources. More than simply terms and concepts, these activities represent res-
ponsibilities which the systems-oriented evaluator is expected to perform.

Central to the systems approach is the blending of the humanistic and
behavioralistic principles of psychology. The systems approach is considered
humanistic in that it requires measurement of the needs of those the program is
to serve. Through the conduct of needs assessments, the systems approach identi-
fies discrepancies between "what is desired" and "what exists" and uses these
discrepancies to provide direction for program development. Later, through pro-
gram evaluation, the systems approach determines whether the desired state
actually has been achieved. Needs assessments play a particularly central role
in the systems approach by linking program design to extant needs for the purpose
of improving program performance.
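
The discrepancy idea can be made concrete with a minimal sketch. It assumes,
purely for illustration, that "what is desired" and "what exists" have each
been reduced to a single percentage per goal; the function name, goal names
and figures below are invented and come from no study cited here.

```python
def discrepancies(desired, existing):
    """Return (goal, gap) pairs for every goal where the desired level
    exceeds the existing level, largest gap first."""
    gaps = []
    for goal, target in desired.items():
        gap = target - existing.get(goal, 0)  # a goal never measured counts as 0
        if gap > 0:
            gaps.append((goal, gap))
    return sorted(gaps, key=lambda pair: pair[1], reverse=True)


# Hypothetical figures: percent of students expected to reach mastery
# versus percent currently doing so.
desired = {"reading": 90, "arithmetic": 85, "writing": 80}
existing = {"reading": 70, "arithmetic": 85, "writing": 75}

print(discrepancies(desired, existing))
# [('reading', 20), ('writing', 5)]
```

Running the same comparison after the program has operated, with new figures
for "what exists," corresponds to the later evaluation step of checking
whether the desired state has been achieved.
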

The systems approach derives its concepts and tools from a wide variety of
disciplines including computer science, engineering, management science and
economics. These tools are employed in the systems approach with the primary
purpose of assuring that the program does what it is supposed to do. Accordingly,
a systems approach to program development may specify rather elaborate procedures
for assuring the accuracy and representativeness of the objectives upon which a
program is to be based, for analyzing the characteristics of learners and the
learning task and for monitoring the development process itself. These responsi-
bilities have resulted in the blending into a single approach of concepts pre-
viously limited to either the field of instructional development or evaluation.
This representation in a single approach of two previously distinct specialties
has not been without its problems.

A major question around which some concern exists is whether the evaluator,
especially the formative evaluator, should be distinct from the developer or whether
these roles represent responsibilities which can be fulfilled by the same indi-
vidual working within the context of a systems approach. Some evaluators and
developers warn that when role distinctions become unclear, as when an evalua-
tor defines program requirements, conducts needs assessments and performs
learner and task analyses, the program may suffer from what has come to be
called co-option. This refers to the situation in which the evaluator is so
immersed in the values, feelings and intents of the developer that evaluations
are no longer an objective guide to
program effectiveness. On the other hand, some evaluators and developers
(Grobman, 1968; Butman and Fletcher, 1974) contend that development is so
closely tied to evaluation that any separation of roles or functions is at
best an artificial distinction that may detract from rather than add to the
development process. The popularity of "third party" or independent summative
evaluations has dissipated to some extent the differences between these
perspectives, when they are conducted in addition to formative assessments of
program effectiveness and when the third party summative evaluator has not been
provided knowledge of the outcome of the previous formative evaluations.

While little has been written about the role and function of the evaluator
within the context of program planning, development and evaluation, it is not
uncommon for a program to be planned and developed in such a way as to either
encourage or preclude a certain kind of evaluation or that once the program has
been developed the evaluator is forced to take a certain approach to evaluation
regardless of its responsiveness to client needs. The systems approach argues,
however, that the evaluator must serve critical functions early on in the develop-
ment process to prevent just such an eventuality. Some of these "early on"
evaluation activities are addressed below. (For the systems approach see also
Banathy, 1968; Briggs, 1977; Davis, Alexander and Yelon, 1975; and Interservice
Procedures for Instructional Systems Development, 1975.)

IV. Prospects for the 1980's: Implications of the Emerging Trends

The foregoing review of emerging trends has attempted to touch upon current
evaluation theory and practice. This review has many implications for developers
and evaluators of educational programs. These implications are generalizable
to a variety of educational contexts, be it elementary and secondary school,
college, graduate school, inservice education or military training.

The purpose of this concluding section is to present several major implications
of the emerging trends. These implications will be discussed generally and
then illustrated with a specific advancement or change in evaluation practice
which, in the opinion of the author, is likely to occur in the not-too-distant
future. While the above trends represent an analysis of where evaluation is
heading, the following implications represent signs or examples of the type of
changes or advances which might be expected to result from these trends. These
implications fall into the areas of systems approaches, naturalistic observation,
needs assessment, policy assessment, and the role of the evaluator.

Implications for a Systems Approach

One implication of the emerging trends is that there is a need for a coherent,
integrated approach to program planning, development and evaluation. Arguments
have been presented that planning, development and evaluation can be seen as
component parts of a unitary process, rather than conceptualized as separate and
distinct activities. Program planning, especially, can be conducted with an
eye toward program development (which it usually is) and program evaluation (which
it usually is not). This implication can be reduced to a call for the application
of a systems approach to instructional planning, development and evaluation.
Kaufman (1972) proposes that a systems approach to evaluation requires
the application of a variety of tools and techniques borrowed from the fields
of computer science, cybernetics, engineering, management and operations research.
These include simulation, operational gaming, the Program Evaluation and Review
Technique (PERT), the Critical Path Method (CPM), the Delphi technique, and
other systems analysis techniques. These tools are essentially modeling approaches
to problem solving which fall under the rubric of systems analysis. While some of
these modeling approaches have a distinct format and purpose, Kaufman (1972)
advocates the use of graphic models for the general purpose of "displaying (or
describing) a System and its components and subsystem relationships in a simple
'at-a-glance' format" (p. 16).

Some recent developments in the field of general systems theory (Churchman, 1968,
p. 155) have suggested that modeling as a means of studying a system may be useful
for planning, developing and evaluating an educational program. Without guidelines
on how systems modeling can be used to study educational programs, however, it
is unlikely that the resulting models will be either communicative or generalizable
across settings or applications. One implication for the not-too-distant future
is the emergence of specific systems modeling techniques for decomposing
or breaking down an instructional program (system) into its component parts prior
to evaluation. Bloom et al. (1971) have already called for the use of such a technique,
called a behavior by content matrix (or table of specifications), for understanding
the nature of a developing program and guiding its evaluation. These authors
suggest that a breakdown of the learning task "provides the specifications for
formative evaluation and other procedures" (p. 17).

Ross (1977), Ross and Brackett (1976) and Ross and Schoman (1977) have
suggested that a good system modeling technique should have certain specifiable
properties. While referring to the development and portrayal of complex systems, the
properties posited by these authors are also applicable to the development and
evaluation of educational programs. Their recommendations with some extensions
and modifications to program evaluation are as follows:

1. Programs are best studied by building a model which
expresses an in-depth understanding of the program,
sufficiently precise to serve as the basis for program
development and evaluation.

2. Analysis of any program should be top-down (moving from
general to specific outcomes), modular (take into consi-
deration all component parts) and hierarchic (determine
how the parts are tied together, i.e., structured).

3. Program activities should be represented by a diagram
which shows program components, their interfaces, and
their place in the hierarchic structure.

4. The model-building technique must represent behaviors
the program is to produce, activities the program is
to provide, and relationships among behaviors and
activities.

5. All planning, design, development, and evaluation
decisions should be in writing and available for open
review to all team specialists.

These authors have developed a specific technique, the Structured Analysis
and Design Technique (SADT), which meets these requirements and is applicable
to the planning and evaluation of instructional programs.

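A top-down, modular, hierarchic breakdown of the kind these recommendations
describe can be sketched as a simple tree. The sketch below is not SADT
notation and makes no claim to reproduce it; the class `Component`, the helper
functions and the sample reading program are all hypothetical, chosen only to
show a program decomposed into component parts prior to evaluation.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One part of a program; `children` holds its sub-parts."""
    name: str
    children: list = field(default_factory=list)


def outline(node, depth=0):
    """Top-down listing: each component indented under its parent."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


def leaves(node):
    """The most specific components, i.e. those with no sub-parts."""
    if not node.children:
        return [node.name]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found


# Hypothetical program, moving from general to specific.
program = Component("Reading program", [
    Component("Decoding unit", [
        Component("Phonics drills"),
        Component("Sight-word practice"),
    ]),
    Component("Comprehension unit", [
        Component("Guided reading"),
    ]),
])

print("\n".join(outline(program)))
print(leaves(program))
```

The indented outline serves as the written, reviewable record called for in the
fifth recommendation, while the leaf components are the natural units at which
formative evaluation questions can be posed.
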
Another implication of general systems theory for program evaluation is that
a program cannot be fully understood unless its relationship to the larger
system in which it operates is known. Systems theory suggests that the
behavioral changes often attributed to an instructional program are not due
to the program alone but to the interaction of the program with a milieu of
variables comprising the environment of which it is a part. Simply put, systems
theory suggests that more forces are at work than the program in effecting
program outcomes and the more these other forces can be revealed through specific
tools, such as program modeling, the greater the possibility of understanding
and evaluating the program. While sometimes vague and elusive, instructional
programs can be described in such a way (i.e., more precisely) that acknowledges
the complex schema of person to person, person to environment, and environment
to environment relationships in which they operate. To this end system analytic
tools generally and system modeling techniques specifically can be useful in as-
sisting evaluators to identify the contextual variables which moderate the effec-
tiveness of instructional programs.

Implications for Naturalistic Inquiry

It is at this juncture that naturalistic inquiry and systems theory
become reciprocally supporting concepts. Naturalistic inquiry, primarily
procedures for observing behavior in naturally occurring settings, can provide
a general tool with which the evaluator can identify and ultimately record the
contextual factors which moderate a program's effectiveness. Our increasing
awareness of the multidimensionality of the environment in which programs operate
has led to the development of general methods by which this environment can
be better understood. Naturalistic inquiry provides one such method, for
recent advances in the behavioral and social sciences have made it increasingly
difficult to discern important concepts or principles without viewing the
complex whole of which they are a part. The social and behavioral sciences have,
in a manner of speaking, run out of simple solutions. Or, more correctly, they
have found simple solutions to old problems inadequate in light of recent dis-
coveries and advancements which have all but nullified many "simple" views of in-
struction and behavior. Complexity is a fact of life and problem solving techniques
which recognize this multidimensional environment seem particularly timely. This is why
simplistic views of educational programs may no longer be credible and why natura-
listic inquiry coupled with systems theory will be a useful tool for describing
a program in terms of the larger system in which it operates. It is
this extrospective - as opposed to introspective - view that brings systems
theory and the aims of naturalistic inquiry together. Programs should not only be de-
signed but also evaluated from a viewpoint which considers the effects larger
systems (programs) have on smaller systems (programs). In the language of the
systems approach, extrospective analyses trace program effects to contexts not
ordinarily included in formal models of evaluation. These models commonly provide
only for introspective analyses that trace program effects within the bounded
context of the program under consideration.

Lastly, it is important, but unfortunate, to note that many programs are
designed, operated and evaluated as though they were ends in themselves without
considering that all programs are intended to satisfy the requirements of some
larger system of which they are a part, just as the objectives of a child's
homework assignment are determined by the objectives of the unit of which it is
a part, the unit objectives by the subject matter of which the unit is a
part, and the subject matter by the objectives of the community that determines
what is "best" for its children. Here is where the means-ends relation-
ship which often goes unnoticed with conventional evaluation models could be
uncovered and evaluated with the systems approach. Systems theory and natura-
listic inquiry provide a basis for identifying the means-end continuum, deter-
mining what program goals (ends) are not ends-in-themselves but means to still
other ends, and whether the means actually justify the ends.

Consistent with the above notions are the following claims commonly ascribed
to the systems approach (Van Gigch, p. 30):

(1) The systems approach is indispensable in considering the rela-
tionships of a particular problem to its environmental condi-
tions and in identifying the factors and variables that affect
the situation.

(2) The systems approach brings out in the open inconsistencies
of objectives when treating the various agents who play a
part in the programs of the same system.

(3) The systems approach provides a useful framework in which
the performance of the various systems, subsystems, and the
whole system can be evaluated.

(4) The systems approach and its attendant methodology can be used
to redesign the existing system and to compare and test the
relative worth of alternative plans.

Implications for Needs Assessment

A third implication of the emerging trends derives from both the value-oriented
and systems-oriented approaches to evaluation. This implication is that needs
assessment is an evaluation activity that should be conducted at all stages of
program development. Kaufman (1972) gives top priority to the needs
assessment approach to evaluation. Indeed, in Kaufman's recent writing on
the topic (Kaufman, 1977), six types of needs studies are posited. The functions of
these six types of needs studies are to: identify problems based upon needs (Alpha
type), determine solution requirements and identify solution alternatives (Beta
type), select solution strategies from among alternatives (Gamma type), implement
program (Delta type), determine performance effectiveness (Epsilon type), revise
as required (Zeta type).

To Kaufman, a systems approach is a sequential series (Alpha-Zeta) of needs
assessments, a view which is consistent with a systems
orientation. This series is presented in Table 5 along with some planning and
evaluation tools suggested by Kaufman that are associated with each type.



Table 5

Planning and Evaluation Tools Available for Performing
Each of the Functions of a System Approach*

Type of Needs   System Approach             Possible Planning Tools
Assessment      Function                    Associated with Each Function

Alpha Type      Identify problem            Needs assessment (Alpha type).
                based upon needs.

Beta Type       Determine solution re-      System analysis, needs analysis,
                quirements and identify     behavioral objectives, front-end
                solution alternatives.      analysis, performance analysis.

Gamma Type      Select solution strate-     Systems analysis, cost-effective-
                gies from among alterna-    ness analysis, PPB(E)S, simulation,
                tives.                      operations research/analysis,
                                            methods-means selection techni-
                                            ques, gaming.

Delta Type      Implement.                  PERT, CPM, management by
                                            objective, management by exception.

Epsilon Type    Determine performance       Testing, assessment, auditing.
                effectiveness.

Zeta Type       Revise as required.         Discrepancy analysis.
                                            (Similar to a needs assessment.)

*from Kaufman, 1977

Another noteworthy aspect of Kaufman's (1977) approach is that it is
hierarchical with respect to making faulty assumptions and achieving significant
educational change. Kaufman states that the evaluator may start at any level
of needs assessment but the further from the top (Alpha type) the level of
entry, the lower the probability of actually achieving a meaningful change
in educational practice and the greater the probability of making errors due
to faulty assumptions. Value-oriented definitions of evaluation would require
entry at the Alpha level (to identify needs and design the program). Decision-oriented
definitions generally assume a lower entry level, sometimes as low as Zeta type
needs assessment (to revise the program).

Needs assessment is usually conceptualized within a much narrower context
than is reflected by Kaufman's taxonomy. However, a wide variety of needs
assessment types, techniques and procedures are reported in the literature. These
techniques include goal setting and goal rating procedures, strategies for
assessing the current status of a program, discrepancy analysis, priority setting
methods, and various specialized techniques. (See a review by Witkin, 1977,
describing these techniques and their advantages and disadvantages.)

A fourth implication of the emerging trends is that policy assessments should
be conducted early on in the planning process. Often confused with the concept
of needs, policies represent a distinct area of inquiry which provides the data
upon which needs studies are based. Policy studies come before needs studies and
are used in deciding the type of needs study that should be conducted.

In a most general sense, policy assessment is the process by which one
understands and anticipates the kinds of issues that are expected to result from
alternative courses of action. These studies systematically examine the effects
on society that may occur when an educational technology, program or product is
introduced, extended or modified. Environmental impact statements now mandated
of chemical processors, automobile producers, steel manufacturers, and airlines
are examples of policy assessments. These assessments may be distinguished
from needs studies by their attempts to:

1. Clarify goals,

2. Identify assumptions behind goals,

3. Define the consequences of goals, and

4. Portray alternative courses of action (goals).

These objectives entail the corollary tasks of identifying the parties who will
be affected both directly and indirectly by the technology, program or product
and describing the social, institutional, technological and economic factors which
may change or be changed by the newly developed technology, program or product.
Contrary to the quantitative data which often earmark needs assessment,
policy assessment often results in a mix of hard and soft data. Kaufman's (1977)
Alpha type needs assessment is, in part, a policy assessment when the inquiry
is focused on desired goals. However, as Kaufman's taxonomy of needs assessments
moves from a Beta to a Zeta type, the emphasis shifts from goals to means. As
one moves down the list of Kaufman's five remaining types, the focus of the
needs study changes to increasingly reflect means (Beta and Gamma types) or take
means as givens (Delta, Epsilon and Zeta types). The fundamental difference
between policy assessment and needs assessment is that the former focuses on the
legitimacy of goals while the latter focuses on the legitimacy of means as
illustrated by the following:



Topic of Policy Assessment                     Topic of Needs Assessment 

Should we contain the Soviets in East          How do we contain the Soviets in 
Africa?                                        East Africa? 

What alternatives are available to our         How do we move oil from Alaska's 
energy crisis?                                 North Slope? 

Should we teach vocational education           In what form should we teach 
subjects to college bound students?            vocational education to college bound 
                                               students? 

Should technological advancements,             What technological advancements 
such as computer assisted instruction,         with direct instruction capability 
be used in schools to perform direct           are suitable for the schools? 
instructional functions? 



In each of the above, policy assessment focuses attention on the goal itself, 
the assumptions and consequences underlying it and/or alternative courses of 
action. Policy assessment is more fundamental than needs assessment, 
determining the area of inquiry for a needs assessment and the type of needs 
study most appropriate for a particular course of action. A needs assessment 
works most effectively in conjunction with a policy assessment which first must 
determine the appropriateness of a particular course of action. 

The methodology of policy assessment is perhaps one of the fastest developing 
areas in the general field of evaluation. Methodological advancements in this 
area have spanned a broad array of qualitative and quantitative techniques, often 
combining the two in unique ways. Coates (1976) has reported on a large list 
of these and has provided documentation as to their use. The following list 
focuses on those techniques from his list likely to be used by the evaluator in 
the not-too-distant future: 

1. Trends extrapolation and futures-related techniques. 

Forecasting of the time of occurrence of an event related to a 
particular goal. 
(Hencley and Yates, 1974) 

2. Risk-benefit and fault-tree analyses. 

The codification of risks and assorted options under varying 
conditions of uncertainty. 
(National Academy of Engineering, 1971) 

3. Delphi technique. 

Subjective forecasting among a panel of experts using cycles 
of information and feedback without face-to-face confrontation. 
(Linstone and Turoff, 1975) 

4. Scenario, gaming and simulation. 

Mathematical and nonmathematical techniques for developing 
complex statements of future conditions including psycho- 
dramatizations of current and existing states and simulations 
of future states. 
(Abt, 1970) 

5. Cross impact analysis. 

A process whereby each individual prediction in a forecast 
is evaluated in relation to the probable truth or falsity 
of other predictions. 
(Monsanto, 1973) 

6. Morphological analysis. 

A process by which all possible questions and answers per- 
taining to a certain problem are exhausted in a large question 
by answer matrix. 
(Zwicky, 1957) 

7. Decision/relevance tree. 

A method for exhausting all possible options and alternatives 
with regard to a particular problem. 
(Gordon et al., 1974; Gulick, 1979) 

8. Judgment theory. 

A procedure which combines interrogation with statistical 
regression to assign weights to alternative courses of action. 
(Hammond and Summers, 197?) 

9. Cost-benefit (input/output) studies. 

A class of econometric models using regression analyses 
for interrelating cost and productivity variables. 
(Leontief, 19??; see also Orlansky and String, 1978) 

10. Structural and system dynamic modeling. 

A group activity which applies logical reasoning to complex 
issues to determine interrelationships and networks among the 
elements of a system. 
(Meadows, Meadows, Randers, and Behrens, 1972) 
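
As an illustration of how one of these techniques operates, the Delphi cycle of information and feedback among a panel of experts might be sketched as follows. This is only a minimal sketch under stated assumptions, not the procedure documented by Linstone and Turoff: the revision rule (each panelist moves a fixed fraction of the way toward the group median after seeing it), the function names, and the sample forecasts are all hypothetical and chosen for illustration.

```python
from statistics import median


def delphi_round(estimates, pull=0.5):
    """One feedback cycle: each panelist sees the group median and moves a
    fraction `pull` of the way toward it (an assumed revision rule)."""
    center = median(estimates)
    return [e + pull * (center - e) for e in estimates]


def run_delphi(estimates, rounds=3, pull=0.5):
    """Repeat the feedback cycle; return the estimates after every round,
    starting with the initial (round 0) estimates."""
    history = [list(estimates)]
    for _ in range(rounds):
        history.append(delphi_round(history[-1], pull))
    return history


def spread(estimates):
    """Range of the panel's estimates, used here as a crude consensus measure."""
    return max(estimates) - min(estimates)


if __name__ == "__main__":
    # Hypothetical forecasts: the year in which some event is expected to occur.
    panel = [1985, 1990, 1992, 2000, 2010]
    for i, est in enumerate(run_delphi(panel)):
        print(i, [round(e, 1) for e in est], "spread:", round(spread(est), 1))
```

Under this assumed rule the panelists never meet face to face; they see only the summary statistic, and the spread of their estimates narrows with each cycle while the median stays fixed. In practice panelists revise their judgments freely and often supply reasons for outlying positions, which this sketch does not model.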

Perhaps the single most definable quality linking these techniques is 
the emphasis and importance they place on the participation of a broad mixture 
of experts and laymen who are capable of making judgments about the implications 
of policy. Most of these techniques represent highly democratic processes in 
which individual opinion is heavily weighted. In contradistinction to other 
forms of government where policy alternatives must be weighed against their 
compatibility with an accepted ideology, these policy assessment techniques 
represent democratically oriented approaches to the generation of alternatives 
and definitions of consequences regardless of their relevance to the existing 
state of affairs. Thus, they represent relative approaches to problem solving 
where the criterion is unaffected by any absolute ideology but instead is 
defined by the best alternative available. This can be both an advantage and 
disadvantage in that (a) usually a large number of decision alternatives are 
generated by these methods (as in brainstorming), often making systematic data 
tabulation difficult, (b) usually small (but not necessarily insignificant) 
differences can exist among them, making evaluation of differences between 
alternatives difficult, and (c) no appeal to "higher" authority can be made 
to simplify the process, at least not initially. Only real-world constraints 
and implications that can be documented by logic or experience may be entered 
as acceptable data. On the other hand these policy techniques (a) provide for 
a maximum number of alternative courses of action to be discovered, many of 
which might not have been considered with less democratic methods, (b) represent 
the best mix of opinions and viewpoints, often representing a consensus opinion 
reflecting the best features of individual viewpoints, and (c) lead to identifi- 
cation of alternatives for which there is high probability that practical strate- 
gies, solutions and methods actually exist. 

The importance of the field of policy assessment over the coming decade will 
be directly linked to the extent to which new technological advancements and 
program development projects create new knowledge gaps and the extent to which the 
indirect and unanticipated effects of a program prove to be as significant as, or 
more significant than, the immediate or planned consequences of that program. The 
probable occurrence of both of the above should make policy assessment a major 
development in the field of evaluation in the next decade. 

Role of the Evaluator 

By now the reader is no doubt aware of the close and non-coincidental 
relationships which bind the concepts of systems modeling, naturalistic inquiry, 
and policy and needs assessment. The interrelationships among these four 
concepts are never more obvious than when their effects on the field of evalua- 
tion are seen through the role of the evaluator. We now conclude with a 
unifying theme which underpins these four concepts. 

The emerging trends depicted in this paper as well as the implications 
above forecast an expanding role for the evaluator. This role will be shaped 
by an increasing tendency to define evaluation broadly and to include within 
this definition evaluation activities that are performed prior to program 
development. The systems approach to evaluation generally and the concepts of 
naturalistic observation, needs assessment and policy assessment specifically 
are some of the ways the evaluator's role is expanding to include activities 
performed early on in the planning and development process. 

The systems approach to evaluation confers on the evaluator a broad array 
of responsibilities heretofore divided among other specialists. While leaving 
the bulk of the planning and development work to these specialists, the systems 
approach places responsibility with the evaluator for many quasi-evaluation or 
concomitant activities which, while not a direct part of the planning and develop- 
ment process, hold potential for substantially improving the quality of program 
planning and development. These activities can so influence the design of an 
evaluation that their completion by the evaluator early on in the planning and 
development process may soon become a standard for good evaluation. An analysis 
of the systems approach can foreshadow some of these "front-end" activities 
the evaluator may soon be expected to perform. 

The systems approach represents the integration of the planning, development 
and evaluation processes into a single coherent approach, thus implying an 
underlying thread by which these processes are linked. The responsibility for 
providing this link may increasingly fall upon the shoulders of the evaluator. 
The evaluator may be expected to perform this linking function with quasi-evalua- 
tion activities that accompany the process of evaluation but which are not 
themselves part of the act of determining the "merit or worth of a thing". Several 
such activities have already emerged, such as policy assessments for determining 
the relative consequences of program objectives, needs assessments for deter- 
mining the most appropriate methods, solutions and strategies for meeting pro- 
gram objectives, systems modeling techniques for clarifying and refining 
program design and naturalistic inquiry for determining the larger system in 
which the program must operate and, hence, means-ends relationships. While 
these activities can materially contribute to a unified approach to planning, 
development and evaluation, they are not the only ones. The evaluator 
can assume many other "front end" functions of a logical nature which help 
clarify and focus the work of the planner and developer. Determining the 
representativeness, accuracy and appropriateness of program objectives, the 
logicalness of intended relationships between program components and expected 
outcomes, the congruency of objectives with planned development activities, 
and the modeling of intended instructional activities and outcomes to depict 
hierarchical and sequential relationships among program objectives, are other 
activities implied by an integrated approach to planning, development and evalua- 
tion. These activities can be described as preformative activities, that is, activi- 
ties performed by the evaluator prior to program development. Because preformative 
evaluation flows from the policy and needs assessment process, it can be 
expected to influence all aspects of program planning, development and evaluation. 

Traditionally, program planning, development and evaluation have been viewed 
as distinct roles or functions related in sequence but not substance. Formal 
training in evaluation has not always emphasized concepts of instructional 
design and development and vice versa. While the notion of formative evaluation 
has linked program development to evaluation, it has not related evaluation to 
program planning. Given the concepts of systems theory, naturalistic inquiry, 
and policy and needs assessment, a coherent, unified approach to planning, 
development and evaluation can emerge. The further development of these concepts 
and the linking of them to the planning, development and evaluation process is an 
important outcome for the field of evaluation in the decade ahead. 

Postscript 

The following observations were prepared after reviewing a small but 
representative sample of studies that have evaluated computer based instruction. 
Using the concepts and milestones discussed in earlier portions of this paper 
as an organizational framework, the evaluation issues, problems and concerns 
raised by these studies were noted and arranged in the following table. This 
table is intended as an argumentative "think piece" for discussing desired ways 
of evaluating computer based instruction. 



Some Observations on the Evaluation of Computer Based Instruction (CBI) 
in Relation to Concepts and Milestones in the Field of Evaluation 



Factors Contributing to the          Their Relationship to the Evaluation of 
Growth of Evaluation                 Computer Based Instruction 



Societal Trends 

For the studies reviewed, the evaluation of computer based instruction seemed to 
have followed many of the same trends that influenced the growth and development 
of the larger field of evaluation. Within this context the evaluation of CAI/CMI 
software development seemed most influenced by the principles of operationalism 
and behavioral objectives, moderately influenced by the ESEA legislation of 1965 
and least influenced by the school and teacher accountability movement. The use 
of models of software development stemming from the curriculum reform movement 
could also be noted, particularly the team authorship of instructional software 
and the formative evaluation and pilot testing of prototype materials. 

Traditional Definitions of Evaluation 

The evaluations of CBI that were reviewed were generally indistinguishable from 
applied research. The applied research character of these evaluations seemed to 
be regarded as a strength and not a factor limiting the generalizability or 
contextual validity of the conclusions that could be drawn from them. While these 
investigations were often called evaluations, they were, in essence, applied 
research studies indistinguishable from textbook definitions of experimental and 
quasi-experimental research. The statistical control of potentially confounding 
(interactive) variables, the terminal availability of data, the use of control 
groups and statistical decision rules (e.g., p < .05) were standard ingredients 
of these studies. 

Models for Evaluation 

Most CBI evaluations did not employ evaluation models. The applied research 
nature of these studies made the utilization of evaluation models difficult 
because many of these models fell outside the traditional definition of applied 
research. Given the applied research nature of these studies, the matching of 
any particular evaluation model to a particular CBI context was not possible. 
For example, only Provus' "cost-benefit" stage and Stufflebeam's "product" stage 
clearly matched the stated purposes of CBI evaluations, while such concepts as 
Stake's "logical contingency," Stufflebeam's "context evaluation" and Provus' 
"program definition" seemed too far removed from the conclusion oriented and 
hypothesis testing intent of these evaluations to be of use. 

Decision-oriented Evaluation 

Most studies reviewed seemed to fit the PDK National Study Committee's definition 
of educational evaluation, stated as: 

    ...the process of delineating, obtaining, and providing useful 
    information for judging decision alternatives. 

The purpose of many of these studies was to provide decision makers with 
information as to whether previously articulated program goals and objectives 
were being met. This purpose seemed to add a decidedly summative emphasis to many 
evaluations, resulting in considerably less emphasis on formative evaluation and 
program modification. Sometimes conclusions were used in an all or none manner: 
the program was either accepted or rejected, adopted or discontinued. But at 
other times the data were too ambiguous to point to such all-exclusive 
alternatives and, hence, were ignored in making recommendations for program 
improvements. 

Value-oriented Evaluation 

CBI evaluations seemed to have paid little attention to the relationship between 
means and ends. Outcomes studied seldom went beyond end-of-instruction attitude 
and achievement indicators and seldom examined the extent to which the results of 
the computer based instruction served larger program or institutional goals. 
Goals and standards for a program were generally taken as givens and the program 
judged on the basis of how well it met these goals and standards. For example, 
better attitude and higher achievement were sometimes considered exclusive ends, 
even when the goal of the instruction might have also included the reduction of 
instructional time and successful on the job performance. 

Naturalistic Inquiry 

Naturalistic inquiry played little, if any, role in the CBI evaluations reviewed. 
If naturalistic inquiry is defined as "any form of research that aims at 
discovery and verification through observation" (Willems and Rauch, 1969, p. 81) 
or "...slice of life episodes documented through natural language..." (Wolf and 
Tymitz, 1976-1977), little in the manner in which these CBI evaluations were 
conducted would suggest such a theme. Thus, naturalistic inquiry as a series of 
observations that are alternately directed at the process of "discovery and 
verification" generally did not characterize CBI studies. Also, evaluators seldom 
approached the data collection with a minimum of preconceived categories or 
notions of what would be seen or as though the phenomena were being observed for 
the first time. Thus, data were tabulated and analyzed in traditional ways with 
traditional statistical techniques, allowing little to be discovered other than 
that which was expected at the onset of the study. 

Systems Approach 

If the systems approach is viewed as a coherent, integrated approach to planning, 
development and evaluation, then few CBI projects evidenced this concept. To the 
contrary, planning, development and evaluation were generally compartmentalized 
with the evaluator arriving at the end of the development task to fill an applied 
research or summative role. Activities such as defining [...] analyses, 
identifying entry behaviors and writing performance objectives, while sometimes 
implicitly carried out by program developers and managers, generally were not 
considered part of the process of evaluation and seldom were conducted in a 
systematic manner. Other characteristics of the systems approach, such as its 
capacity to deal simultaneously with multiple dimensions of the instructional 
environment, its reliance on program modeling to describe this environment, and 
the conceptualization of parts within wholes (programs within larger programs), 
also were not in evidence. CBI evaluations seldom considered the effects of the 
system in which the program operated or traced program effects to other than the 
immediate stimuli under consideration. 

Needs and Policy Assessment 

Absent from most CBI evaluations was a determination of the needs and policies 
upon which the design for a particular program was based. This led to the failure 
of many studies to actually document whether the program's performance met some 
empirically determined need. Also, the policies of the agencies or institutions 
in which CBI programs operated, while perhaps implicitly known, were seldom 
empirically determined. These policies, if documented, might have in some 
instances resulted in the selection of different control treatments with which 
the computer based instruction was to be compared by ruling out the feasibility 
of some forms of instruction for technical, administrative or economic reasons. 
Most importantly, needs and policy studies were not used to pinpoint the precise 
nature of the problem which justified development of the computer based 
instruction in the first place. The reason why CBI was considered a more 
reasonable alternative than some other mode of instruction was seldom made clear 
and this, in turn, focused evaluations on CBI generally rather than on those 
unique aspects of the medium which could account for its superiority over an 
alternative treatment. 

Role of the Evaluator 

The role of the evaluator was essentially one of an applied researcher. Typically 
the evaluator role consisted of instrument development, data collection, 
statistical analysis and report writing. Seldom did it allow for the 
determination of needs and policies relevant to the design of a program or for 
determining the ultimate end to which CBI was to be the means. Generally, CBI 
evaluations did not define evaluation broadly to include preformative activities 
which could determine the needs and justification upon which a program might be 
based. This role for the evaluator was most consistent with the decision-oriented 
definition of evaluation wherein the goals and objectives for a program are taken 
as givens. This role was less consistent with the value-oriented definition of 
evaluation wherein the work of the evaluator includes determination of the merit, 
worth or value of the goals and objectives themselves. 